What a source is in Bulkgrid
A source is a first-class product object, not just a URL passed into a one-off crawl. The current source API supports domain-type sources with configuration such as:
- identifier
- label
- visibility
- source_mode
- crawl_config
- crawl_interval
- custom_interval_minutes
Source modes
Current source modes include:
- discover
- selected_pages
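To make the configuration fields and source modes concrete, here is a minimal Python sketch of a client-side model for a source. The field names and the two mode values come from the lists above; the class name `SourceConfig`, the field types, the defaults, and the placeholder values such as `"private"` are assumptions for illustration, not Bulkgrid's documented schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# The two source modes listed in this section.
SOURCE_MODES = {"discover", "selected_pages"}


@dataclass
class SourceConfig:
    """Hypothetical client-side model of a domain-type source.

    Field names follow the documented configuration keys; the types
    and defaults are assumptions made for this sketch.
    """

    identifier: str
    label: str
    visibility: str
    source_mode: str
    crawl_config: dict = field(default_factory=dict)
    crawl_interval: Optional[str] = None
    custom_interval_minutes: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject modes that are not in the documented list.
        if self.source_mode not in SOURCE_MODES:
            raise ValueError(
                f"unknown source_mode {self.source_mode!r}; "
                f"expected one of {sorted(SOURCE_MODES)}"
            )


# Placeholder values; only the field names and the mode are from the docs.
docs_source = SourceConfig(
    identifier="https://docs.example.com",
    label="Product docs",
    visibility="private",
    source_mode="discover",
)
```

Validating the mode before sending a request is one way to catch typos early; the real API may enforce more constraints than this sketch does.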
Source lifecycle surfaces
For a source, the product supports:
- source details
- source status
- source folders
- source documents
- source changes
- source runs
- manual recrawl
What to decide before you crawl
Domain Scope
Decide which domains are in scope and which are never allowed.
Path Rules
Define which paths are included and which should always be excluded.
Document Links
Decide whether linked documents belong in the same ingestion flow.
Knowledge Boundaries
Keep support, marketing, and internal knowledge separated when needed.
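The domain-scope and path-rule decisions above can be written down as an explicit check before any crawl is configured. The following is a minimal sketch, assuming prefix-based path rules and exact host matching; `CrawlScope` and its fields are hypothetical names for illustration and do not reflect the structure of Bulkgrid's `crawl_config`.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class CrawlScope:
    """Hypothetical scope definition: domains and path prefixes."""

    allowed_domains: frozenset
    blocked_domains: frozenset
    include_prefixes: tuple
    exclude_prefixes: tuple

    def allows(self, url: str) -> bool:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        path = parts.path or "/"

        # Domains that are never allowed win over everything else.
        if host in self.blocked_domains:
            return False
        if host not in self.allowed_domains:
            return False
        # Exclusions are checked before inclusions, so an excluded
        # subtree stays out even when its parent path is included.
        if any(path.startswith(p) for p in self.exclude_prefixes):
            return False
        return any(path.startswith(p) for p in self.include_prefixes)


scope = CrawlScope(
    allowed_domains=frozenset({"docs.example.com"}),
    blocked_domains=frozenset({"internal.example.com"}),
    include_prefixes=("/guides/",),
    exclude_prefixes=("/guides/drafts/",),
)

print(scope.allows("https://docs.example.com/guides/intro"))     # True
print(scope.allows("https://docs.example.com/guides/drafts/x"))  # False
print(scope.allows("https://internal.example.com/guides/intro")) # False
```

Evaluating exclusions first keeps "always excluded" paths out regardless of how broad the include rules become later.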
Practical recommendation
Start with a small, high-value source boundary. Expand only after you validate retrieval quality and operational behavior. For domain sources, Bulkgrid normalizes the identifier to the URL origin when it creates the source. That means https://docs.example.com/foo and https://docs.example.com/bar are treated as the same source root.
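To see why both URLs collapse to one source root, the sketch below reduces a URL to its origin using Python's standard `urllib.parse`. It approximates the behavior described above and is not Bulkgrid's actual implementation; the function name `source_root` is hypothetical, and details such as default-port handling may differ in the product.

```python
from urllib.parse import urlsplit


def source_root(url: str) -> str:
    """Reduce a URL to scheme + host (+ port), dropping path, query, fragment."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    # Lowercase the scheme and network location so comparisons are stable.
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}"


print(source_root("https://docs.example.com/foo"))  # https://docs.example.com
print(source_root("https://docs.example.com/bar"))  # https://docs.example.com
```

A check like this before creating a source can help you notice that a second URL on the same origin would point at an existing source rather than a new one.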