Whitespace Analysis Guide


The Whitespace Analysis page provides an overview of, and insights into, the patent activity that matches the search criteria you enter. The results can be searched and sorted in multiple ways, which makes the page well suited to AI/ML prior art searches, competitive landscape monitoring, identifying underexplored technology areas conducive to R&D innovation, and more.

Analysis and insights available on this page are generated based on user-defined focus keywords, CPC filters, and date ranges. The overview section provides a high-level summary of key metrics, while the tables and charts allow for in-depth exploration of patent filings relevant to the specified criteria. For example, key metrics include subject matter saturation, patent and publication activity rates and momentum, and CPC trends for specific search criteria and semantically similar concepts.

What the Whitespace Overview Measures

For any combination of focus keywords, CPC filters, and date range, the overview performs the following steps:

  1. Builds a target search set using full-text search over Patent Scout's relational database, including exact matches and, when enabled, semantic nearest neighbors.
  2. Counts distinct patents and publications in the target search set (exact, semantic, and combined) and normalizes volume per month.
  3. Tracks monthly patent grants and publications to measure growth trends, such as slope and compound annual growth rate (CAGR), and classifies momentum as rising, stable, or declining.
  4. Aggregates CPC classifications to show the top slices and a broader breakdown for adjacent technology clusters.
  5. Summarizes recency (6/12/18/24 month totals) and, when available, tags saturation with a percentile vs. historical queries.
  6. With “Group by Assignee” enabled, builds an embedding graph and signal cards per assignee.

Inputs & Toggles

At least one focus input (keywords or CPC) is required. The default window covers the past 24 months, anchored to the end date.

Focus Keywords

Use comma-separated words and/or phrases that describe the AI/ML subject matter of interest. The analysis reflects keyword, key phrase, and semantic search matches (when enabled) found in the title, abstract, and claims of patents and publications.

Example: foundation models, multi-modal reasoning, retrieval augmented generation

Notes:
  • Broader keywords and phrases produce a larger result set; combine with CPC filters to narrow the result set to a specific technology area.
  • Focus keywords and phrases are used to obtain both exact matches and semantic nearest neighbors (when enabled).

CPC Filter

Concentrate the target search set in a specific technology area with CPC classification codes. Supports partial codes such as G06N and full designations like G06F17/30.

Example: G06N20/00, A61B5, G06V, G06K9/00

Notes:
  • Multiple CPC filters are OR’ed together; combine with keywords for precise intersections.
  • Use broader prefixes (e.g., G06N) to capture related subgroups when exploring adjacent domains.
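The OR semantics and partial-code behavior described above can be sketched in Python. This is a minimal illustration, not Patent Scout's implementation: the function names `normalize_cpc` and `matches_any_cpc` are hypothetical, and plain string-prefix matching is an assumption about how partial codes are resolved.

```python
def normalize_cpc(code: str) -> str:
    """Upper-case a CPC code and drop whitespace, so ' g06n 20/00' equals 'G06N20/00'."""
    return "".join(code.split()).upper()


def matches_any_cpc(doc_cpcs: list[str], filters: list[str]) -> bool:
    """Return True when any document CPC starts with any filter (filters are OR'ed)."""
    wanted = [normalize_cpc(f) for f in filters if f.strip()]
    if not wanted:
        return True  # assumption: with no CPC filter, nothing is excluded
    codes = [normalize_cpc(c) for c in doc_cpcs]
    return any(code.startswith(prefix) for code in codes for prefix in wanted)


# Filters taken from the example above, entered as comma-separated text.
filters = "G06N20/00, A61B5, G06V".split(",")
hit_exact = matches_any_cpc(["G06N20/00", "H04L9/32"], filters)
hit_prefix = matches_any_cpc(["A61B5/0205"], filters)
miss = matches_any_cpc(["H04L9/32"], filters)
print(hit_exact, hit_prefix, miss)
```

Note that a plain string prefix such as A61B5 would also match A61B50/00, so a production filter would likely need to respect the boundary between main group and subgroup.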

Date Range

Restrict the result set to a specific time range based on patent grant date or publication date. Empty fields fall back to the full data set in Patent Scout's database. When only an end date is provided, the start defaults to 23 months earlier, which yields a 24-month window when the end month is counted.

Example: From 2023-07-01, To 2025-06-30

Notes:
  • Shorter ranges can highlight current activity (e.g., competitors' R&D and investment areas); longer ranges provide more stable density and percentile signals.
  • Momentum uses the monthly series inside the selected window.
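The relationship between the "23 months earlier" default and the 24-month window can be checked with a short Python sketch. The helper names `default_start` and `months_in_window` are hypothetical, and snapping the start to the first day of its month is an assumption made to match the example range above.

```python
from datetime import date


def default_start(end: date, months_back: int = 23) -> date:
    """First day of the month that lies `months_back` months before the end date's month."""
    index = end.year * 12 + (end.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)


def months_in_window(start: date, end: date) -> int:
    """Number of calendar months touched by the window, counting both ends."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


end = date(2025, 6, 30)
start = default_start(end)
window_months = months_in_window(start, end)
print(start.isoformat(), window_months)
```

For an end date of 2025-06-30 the sketch yields a start of 2023-07-01, which spans 24 calendar months.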

Show Semantic Neighbors (toggle)

When enabled, whitespace analysis matches semantic nearest neighbors (based on embedding index) and merges those with exact keyword and phrase matches.

Default: Enabled

Notes:
  • Disable to receive only literal keyword matches (useful for prior art searches).
  • The same date and CPC filters are applied to semantic nearest neighbors after the semantic search returns its results.
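One way to picture the merge described above is the following Python sketch. It is an illustration under stated assumptions, not the actual pipeline: `merge_results` and the dictionary fields `id`, `date`, and `cpc` are hypothetical, exact matches are assumed to be filtered already by the query itself, and CPC filters are treated as simple string prefixes.

```python
from datetime import date


def merge_results(exact, semantic, start, end, cpc_prefixes):
    """Union exact matches with semantic neighbors that pass the date and CPC filters."""

    def keep(doc):
        in_window = start <= doc["date"] <= end
        in_cpc = not cpc_prefixes or any(
            code.startswith(prefix) for code in doc["cpc"] for prefix in cpc_prefixes
        )
        return in_window and in_cpc

    merged = {doc["id"]: doc for doc in exact}  # exact matches come first
    for doc in semantic:
        if keep(doc):  # post-filter applied after the semantic search returns
            merged.setdefault(doc["id"], doc)  # skip documents already matched exactly
    return list(merged.values())


exact = [{"id": "US1", "date": date(2024, 1, 15), "cpc": ["G06N20/00"]}]
semantic = [
    {"id": "US1", "date": date(2024, 1, 15), "cpc": ["G06N20/00"]},  # duplicate of exact
    {"id": "US2", "date": date(2024, 3, 1), "cpc": ["G06N3/08"]},   # passes both filters
    {"id": "US3", "date": date(2019, 3, 1), "cpc": ["G06N3/08"]},   # outside the window
    {"id": "US4", "date": date(2024, 5, 1), "cpc": ["H04L9/32"]},   # outside the CPC filter
]
merged = merge_results(exact, semantic, date(2023, 7, 1), date(2025, 6, 30), ["G06N"])
merged_ids = [doc["id"] for doc in merged]
print(merged_ids)
```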

Group by Assignee (toggle)

Loads more complex, weighted whitespace signals calculated per assignee. When enabled, a context graph, assignee signal cards, and a Sigma visualization appear beneath the overview.

Default: Disabled

Notes:
  • Enable to view focus convergence / subject matter oversaturation signals tied to specific assignees.
  • Toggling on after a run reuses the most recent search parameters (rerunning the search is unnecessary).

Interpreting the Overview Tiles

Saturation

Exact, semantic, and total distinct publication counts inside the window.

  • Review exact vs. semantic to see how literal the coverage is.
  • Activity rate (e.g., new patent grants and publications per month) is the total count divided by the number of months in the window (24 by default).
  • Percentile (when present) maps into Low / Medium / High / Very High guidance.
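The arithmetic behind this tile can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the total is assumed to be the union of exact and semantic matches, and the percentile cutoffs (50, 75, 90) are placeholders chosen for the example because the guide does not state the real thresholds.

```python
def saturation_summary(exact_ids: set[str], semantic_ids: set[str], months: int = 24) -> dict:
    """Distinct counts plus the per-month rate (total divided by months in the window)."""
    total_ids = exact_ids | semantic_ids  # assumption: total is the de-duplicated union
    return {
        "exact": len(exact_ids),
        "semantic": len(semantic_ids),
        "total": len(total_ids),
        "rate_per_month": len(total_ids) / months,
    }


def saturation_label(percentile: float) -> str:
    """Map a percentile to guidance text. The cutoffs are hypothetical placeholders."""
    if percentile < 50:
        return "Low"
    if percentile < 75:
        return "Medium"
    if percentile < 90:
        return "High"
    return "Very High"


summary = saturation_summary({"US1", "US2", "US3"}, {"US3", "US4"})
print(summary, saturation_label(80))
```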

Activity Rate

Average filings per month plus the observed min/max band.

  • Together with the timeline (described below), patent grant/publication activity rates can expose volatility inside a date range.
  • High activity rate coupled with a narrow band indicates steady and consistent emphasis on obtaining IP protection for AI/ML innovations.
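A minimal Python sketch of the average and min/max band follows. The function name `activity_band` and the `relative_spread` measure (band width divided by the average) are hypothetical additions intended only to make "narrow band" concrete; the guide does not define how narrowness is judged.

```python
def activity_band(monthly_counts: list[int]) -> dict:
    """Average filings per month plus the observed minimum and maximum."""
    average = sum(monthly_counts) / len(monthly_counts)
    low, high = min(monthly_counts), max(monthly_counts)
    # Hypothetical helper metric: band width relative to the average volume.
    relative_spread = (high - low) / average if average else 0.0
    return {"average": average, "min": low, "max": high, "relative_spread": relative_spread}


band = activity_band([10, 12, 11, 9, 13, 11])
print(band)
```

A high average with a small relative spread corresponds to the steady, consistent filing pattern described above.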

Momentum

Slope of the monthly time series and CAGR over the window.

  • Momentum bucket: Up (> +0.05), Down (< -0.05), otherwise Flat.
  • Slope is normalized by average volume; CAGR can be used to contextualize growth rate.
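The following Python sketch shows one plausible way to compute these values. It assumes an ordinary least-squares slope over the month index, divided by the average monthly volume, and it assumes the ±0.05 threshold applies to that normalized slope. The function names, and the choice of which first and last values feed the CAGR, are hypothetical.

```python
from typing import Optional


def normalized_slope(counts: list[float]) -> float:
    """Least-squares slope of counts against month index, divided by the mean count."""
    n = len(counts)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    if y_mean == 0:
        return 0.0
    covariance = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(counts))
    variance = sum((i - x_mean) ** 2 for i in range(n))
    return (covariance / variance) / y_mean


def cagr(first: float, last: float, months: int) -> Optional[float]:
    """Compound annual growth rate between two volumes that are `months` apart."""
    if first <= 0 or months <= 0:
        return None  # growth rate is undefined without a positive starting value
    return (last / first) ** (12 / months) - 1


def momentum_bucket(slope: float, threshold: float = 0.05) -> str:
    """Classify momentum using the thresholds listed above."""
    if slope > threshold:
        return "Up"
    if slope < -threshold:
        return "Down"
    return "Flat"


rising = momentum_bucket(normalized_slope([10, 11, 12, 13]))
flat = momentum_bucket(normalized_slope([10, 10, 10, 10]))
falling = momentum_bucket(normalized_slope([13, 12, 11, 10]))
print(rising, flat, falling, cagr(100, 121, 24))
```

Dividing by the average volume makes the slope comparable across queries of very different sizes, which is presumably why a single threshold can be used for every search.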

Top CPCs

Highest volume CPC codes among matched filings.

  • Shows the leading technology slices at a glance.
  • Detailed CPC bar chart includes links to comprehensive and specific definitions for each CPC code, including those outside the top five.

Timeline & CPC Distribution

Timeline sparkline

Plots monthly publication counts across the selected window. Hover in the UI to inspect the exact month totals. Sharp inflections may indicate changes in momentum.

CPC trend chart

Ranks CPC codes by patent and publication volume. A shorter bar generally corresponds to a less explored technology area, whereas a longer bar may suggest a more developed or saturated technology area.

Recent intervals

Summaries for the last 6, 12, 18, and 24 months. Use these totals to compare near-term patent and publication velocity against historical averages.
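The interval totals can be sketched in a few lines of Python. The function name `recent_totals` is hypothetical, and the input is assumed to be a list of monthly counts ordered from oldest to newest.

```python
def recent_totals(monthly_counts: list[int], spans=(6, 12, 18, 24)) -> dict[int, int]:
    """Sum the most recent N months for each span (oldest count first, newest last)."""
    return {span: sum(monthly_counts[-span:]) for span in spans}


monthly_counts = list(range(1, 25))  # illustrative series: 1 filing in month 1, 24 in month 24
totals = recent_totals(monthly_counts)

# Compare near-term velocity with the average over the whole series.
recent_rate = totals[6] / 6
overall_rate = sum(monthly_counts) / len(monthly_counts)
print(totals, recent_rate, overall_rate)
```

In this made-up series the six-month rate (21.5 per month) is well above the overall average (12.5 per month), which is the kind of gap that suggests accelerating activity.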

Patent & Publication Table

The result set table lists up to 1000 patents and publications per target search set and can be sorted by recency, relevance, or assignee name. Click any patent/publication number to open the document in a new tab.

The result set table can be exported as a PDF (up to 1000 patents and publications) for later reference and review. The exported PDF also includes the overview and analysis displayed above the table.

Title: Patent or publication title.
Abstract: The abstract of the patent or publication, truncated to 200 characters.
Patent/Pub No.: USPTO patent or publication identifier with kind code. Links to Google Patents.
Assignee: Canonicalized assignee name when available; 'Unknown' if not present.
Grant/Pub Date: Patent grant or publication date, formatted as YYYY-MM-DD.
CPC: Top CPC codes (section/class/subclass/group) used for classifying the patent or publication (up to four).

Optional Assignee Signals

Switching on “Group by Assignee” augments the whitespace analysis with a per-assignee clustering view. More complex, weighted signals are calculated from semantic embeddings, which are used to build a cosine KNN graph and evaluate four signals per grouping:

  • Potential Gap: Opportunity where an assignee may have some IP protection, but neighboring clusters are underexplored.
  • Bridging Opportunity: Cross-cluster connectors with lower momentum on both sides.
  • Focus Convergence: Risk indicator showing an assignee with IP protection trending very close to the target search set.
  • Crowd-out: Risk indicator where local density and momentum around the target search set are sharply rising.

Toggle on "Group by Assignee" to generate specific analysis and insights scoped to specific entities (e.g., competitors, investors in the AI/ML space, etc.).
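To illustrate what a cosine KNN graph is, the sketch below links each document to its most similar neighbors by cosine similarity. It is a generic, brute-force illustration with hypothetical names (`cosine`, `knn_graph`) and toy two-dimensional vectors; it does not reproduce Patent Scout's embeddings, graph construction, or the scoring of the four signals, none of which are specified in this guide.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either vector has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_graph(embeddings: dict[str, list[float]], k: int = 2) -> dict[str, list[tuple[str, float]]]:
    """For every document, keep the k other documents with the highest cosine similarity."""
    graph = {}
    for doc_id, vector in embeddings.items():
        scored = [
            (other_id, cosine(vector, other_vector))
            for other_id, other_vector in embeddings.items()
            if other_id != doc_id
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        graph[doc_id] = scored[:k]
    return graph


# Toy embeddings: A and B point in nearly the same direction, C is orthogonal to A.
embeddings = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
graph = knn_graph(embeddings, k=1)
print(graph)
```

Dense neighborhoods in such a graph correspond to crowded subject matter, while sparsely connected regions are candidates for the gap and bridging signals listed above.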

Example Workflow

  1. Start with focus keywords and phrases over a 24-month window to gauge baseline saturation and momentum.
  2. Narrow the target search set with CPC filters to concentrate on specific technology areas of interest.
  3. Use the timeline graph to confirm that momentum is accurately labeled (e.g., rising, declining, etc.).
  4. Use the result set table as a quick, comprehensive reference list of patents and publications germane to the target search set.
  5. (Optional) Enable "Group by Assignee" for a more granular view of R&D activity and investment for specific entities (e.g., competitors, investors in the AI/ML space, etc.).

Best Practices

Anchor the window to a strategic milestone

Shift the end date to align with product launches or regulatory moments. Comparing periods highlights whether filings are accelerating into that milestone.

Compare exact vs. semantic saturation

A small gap between exact-match results and semantic-search results indicates that the query terms align well with the terminology conventionally used across the domain. A large gap indicates that the relevant concepts are often expressed in different wording; that is, the domain uses diverse terminology or synonyms that the literal query does not capture. Expanding the keywords and phrases (e.g., with synonyms and abbreviations) and/or adding CPC filters can help refine the target search set.

Monitor CPC drift

When small keyword changes cause noticeable shifts in the CPC distribution, the overall concept likely spans multiple technology areas. Depending on the goal, this may be a signal to explore the concept in more granular clusters.

Troubleshooting

Issue: No results returned

Solution: Verify at least one keyword or CPC is provided. Try expanding the date range or enabling semantic neighbors if the query is very niche.

Issue: Momentum stays flat

Solution: Check the timeline sparkline for month-to-month variability. Extending the window or enabling semantic neighbors adds data points that can make an underlying trend visible.

Issue: Assignee graph looks empty

Solution: Ensure “Group by Assignee” is toggled on and the latest run completed. Some narrow scopes may lack a sufficient number of patents and publications per assignee to expose a signal that satisfies a minimum level of confidence.