
Overview
Key advantages
- No query translation: Queries run against the local replica, so no translation logic is required
- Performant: Eliminates synchronous network calls to the target API
- Feature-complete: Charts, filtering, and search work out of the box
- Flexible: Implement custom logic for fetching target API data
- Robust: Recover bad states by reconstructing the replica from scratch
Minimal implementation
Node.js
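A minimal sketch of what such an implementation could look like. It assumes the replica is built with createReplicaDataSource and a pullDumpHandler (both named later on this page), and that entries are shaped as { collection, record }. The fetchAllUsers function is a hypothetical stand-in for a real target API call, and the package name in the wiring comment is an assumption.

```javascript
// Hypothetical stand-in for a call to the target API.
async function fetchAllUsers() {
  return [
    { id: 1, name: 'Ada' },
    { id: 2, name: 'Grace' },
  ];
}

// Dump handler: returns every record of the target API in a single batch.
async function pullDumpHandler() {
  const users = await fetchAllUsers();
  return {
    more: false, // single page, nothing left to fetch
    entries: users.map((record) => ({ collection: 'users', record })),
  };
}

// Assumed wiring (package name and agent call are assumptions, not run here):
// const { createReplicaDataSource } = require('@forestadmin/datasource-replica');
// agent.addDataSource(createReplicaDataSource({ pullDumpHandler }));
```

With no other option set, this relies on the default in-memory cache, so the handler runs at each startup.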
Known limitations & solutions
| Limitation | Solution |
|---|---|
| Full data dump required at each startup | Implement persistent cache |
| Empty collections and foreign keys not auto-detected | Provide explicit schema definition |
| Data never updates after initial import | Implement update handlers |
| Read-only data | Implement write handlers |
| Nested fields and arrays in API responses | Use record flattener utility |
Persistent cache
The Forest Node.js back-end uses a SQL database as its underlying cache mechanism. By default, an in-memory SQLite database is used.
Limitations of in-memory cache
The default in-memory approach presents two main challenges:
- Extended startup time: The back-end must re-fetch all data from the target API on each restart
- High memory consumption: All data remains in memory, which becomes problematic for large datasets
When to use persistent cache
Depending on which API you are targeting, an in-memory cache may be perfectly adequate for smaller datasets. However, larger systems such as CRMs or databases containing millions of records benefit significantly from persistent storage.
Cache initialization
Forest automatically detects when the schema of the tables in the caching database does not match the schema of the target API. When a mismatch occurs, tables and indexes are dropped, recreated, and repopulated from the target API.
Configuration options
- cacheInto: Accepts a connection string or configuration object for the SQL connector
- cacheNamespace: Prefixes table names, useful for sharing databases or running multiple replicas
SQLite file example
Node.js
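A possible set of options for caching into a SQLite file. The sqlite: connection-string form, the file path, and the namespace value are assumptions chosen for illustration; only the cacheInto and cacheNamespace option names come from this page.

```javascript
// Sketch of replica options that persist the cache in a SQLite file.
// The 'sqlite:' URI form and the path are assumptions.
const sqliteCacheOptions = {
  cacheInto: 'sqlite:/tmp/forest-replica-cache.db',
  cacheNamespace: 'crm_', // prefix applied to cache table names
};

// These options would be merged with the handlers passed to createReplicaDataSource.
```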
PostgreSQL example
Node.js
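A comparable sketch for PostgreSQL. The environment variable name, the fallback connection string, and the namespace are hypothetical; a distinct cacheNamespace per replica is what would let several replicas share one database.

```javascript
// Sketch of replica options that persist the cache in PostgreSQL.
// CACHE_DATABASE_URL and the fallback URI are hypothetical values.
const postgresCacheOptions = {
  cacheInto:
    process.env.CACHE_DATABASE_URL ||
    'postgres://user:password@localhost:5432/forest_cache',
  cacheNamespace: 'replica_a_', // keeps this replica's tables apart from others
};
```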
Updating the replica
In real-world scenarios, the Forest back-end must display up-to-date data.
Three update methods
Use these approaches independently or combine them:
- Scheduled rebuilds - Refetch all records periodically
- Change polling - Uses Forest events to detect modifications
- Change pushing - Leverages target API events via webhooks
Scheduled rebuilds
Scheduled rebuilds are the simplest approach for updating replica data: all records are fetched from the target API at regular intervals. This method works with any API but is less efficient for large datasets, since it fetches all records regardless of changes.
Configuration options
pullDumpOnRestart: When set to true, data fetches on each back-end startup. This is always enabled for default in-memory cache implementations.
pullDumpOnSchedule: Accepts cron-like schedule patterns for periodic updates. For example: ['0 0 0 * * *', '0 30 18 * * *'] triggers daily at midnight and 6:30 PM.
Schedule syntax
The system uses the croner NPM package for schedule parsing, with this format:
* * * * * * - Every second
0 * * * * * - Every minute
0 0 9 * * 1 - Mondays at 9am
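To make the field order concrete, here is a small sketch that labels the six fields of a pattern (seconds first) and shows the two rebuild options together. The describeSchedule helper is hypothetical and written for this illustration; it does not call croner, and its strict six-field check is a restriction of the helper only.

```javascript
// Field order of the six-field patterns used on this page.
const FIELD_NAMES = ['second', 'minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'];

// Hypothetical helper: labels each field of a six-field pattern.
function describeSchedule(pattern) {
  const parts = pattern.trim().split(/\s+/);
  if (parts.length !== FIELD_NAMES.length) {
    throw new Error(`Expected 6 fields, got ${parts.length}: "${pattern}"`);
  }
  return Object.fromEntries(FIELD_NAMES.map((name, i) => [name, parts[i]]));
}

// Rebuild at startup, then daily at midnight and at 6:30 PM.
const rebuildOptions = {
  pullDumpOnRestart: true,
  pullDumpOnSchedule: ['0 0 0 * * *', '0 30 18 * * *'],
};
```

For example, describeSchedule('0 30 18 * * *') labels 18 as the hour and 30 as the minute.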
Handler implementation
The pullDumpHandler returns entries for import and supports pagination. The request object provides previousDumpState (for change detection), cache access, and reasons (startup/schedule triggers).
The response object specifies the entries to import, pagination via the more flag, and state persistence through the nextDumpState and nextDeltaState fields.
Key advantage: Old data remains available to users until new data processing completes, preventing service disruption.
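The pagination flow described above could be sketched as follows. The fetchPostsPage function and its data are hypothetical stand-ins for a target API, and the sketch assumes that nextDumpState is returned while more pages remain and nextDeltaState once the dump is complete.

```javascript
// Hypothetical target API holding five posts, served page by page.
const ALL_POSTS = Array.from({ length: 5 }, (_, i) => ({ id: i + 1, title: `Post ${i + 1}` }));

async function fetchPostsPage(offset, limit) {
  const items = ALL_POSTS.slice(offset, offset + limit);
  return { items, hasMore: offset + limit < ALL_POSTS.length };
}

const PAGE_SIZE = 2;

// Paginated dump handler: one page per call, resuming from previousDumpState.
async function pullDumpHandler(request) {
  // Assumption: previousDumpState is empty on the first call of a dump.
  const offset = request.previousDumpState?.offset ?? 0;
  const page = await fetchPostsPage(offset, PAGE_SIZE);
  const entries = page.items.map((record) => ({ collection: 'posts', record }));
  const lastItem = page.items[page.items.length - 1];

  return page.hasMore
    ? { more: true, entries, nextDumpState: { offset: offset + PAGE_SIZE } }
    : { more: false, entries, nextDeltaState: { lastSeenId: lastItem ? lastItem.id : null } };
}
```

In this sketch the caller is expected to invoke the handler again with the returned nextDumpState for as long as more is true.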
Change polling
Change polling is a strategy for updating replica data sources by fetching only records that have changed, rather than pulling all data from the target API on each update.
When to poll for changes
Four triggering events are available:
- pullDeltaOnRestart: Handler executes when the back-end restarts
- pullDeltaOnSchedule: Handler runs on a cron-like schedule (same syntax as pullDumpOnSchedule)
- pullDeltaOnBeforeAccess: Handler executes before each datasource access; GUI blocks until completion
- pullDeltaOnAfterWrite: Handler executes after each write operation; GUI blocks until completion
pullDeltaOnBeforeAccessDelay (milliseconds) groups multiple requests sent during the delay period, reducing calls to your target API. Set to 0 to disable.
Handler implementation
Implement a pullDeltaHandler function that receives a request object containing:
- previousDeltaState: Persisted state from previous calls
- affectedCollections: Collections being accessed or written to
- cache: Interface for reading cached data
- reasons: Array explaining why the handler was invoked
The handler returns a response object containing:
- more: Boolean indicating if additional changes exist (triggers immediate re-call)
- nextDeltaState: State persisted for subsequent handler invocations
- newOrUpdatedEntries: Records created or modified since last call
- deletedEntries: Records removed since last call
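A sketch of a delta handler built on these request and response fields. The change log and fetchChangesSince are hypothetical stand-ins for a target API that exposes a sequence-numbered list of changes; the schedule and delay values are arbitrary.

```javascript
// Hypothetical change log standing in for the target API.
const CHANGE_LOG = [
  { seq: 1, type: 'upsert', collection: 'users', record: { id: 1, name: 'Ada' } },
  { seq: 2, type: 'upsert', collection: 'users', record: { id: 2, name: 'Grace' } },
  { seq: 3, type: 'delete', collection: 'users', record: { id: 1 } },
];

async function fetchChangesSince(seq, limit) {
  const changes = CHANGE_LOG.filter((c) => c.seq > seq).slice(0, limit);
  const lastSeq = changes.length ? changes[changes.length - 1].seq : seq;
  return { changes, lastSeq, hasMore: CHANGE_LOG.some((c) => c.seq > lastSeq) };
}

// Delta handler: imports only what changed since the persisted sequence number.
async function pullDeltaHandler(request) {
  const since = request.previousDeltaState ?? 0;
  const { changes, lastSeq, hasMore } = await fetchChangesSince(since, 2);
  const toEntry = ({ collection, record }) => ({ collection, record });

  return {
    more: hasMore, // true triggers an immediate re-call
    nextDeltaState: lastSeq,
    newOrUpdatedEntries: changes.filter((c) => c.type === 'upsert').map(toEntry),
    deletedEntries: changes.filter((c) => c.type === 'delete').map(toEntry),
  };
}

// Example trigger configuration using the options listed above.
const deltaOptions = {
  pullDeltaHandler,
  pullDeltaOnRestart: true,
  pullDeltaOnSchedule: ['0 * * * * *'], // every minute
  pullDeltaOnBeforeAccess: true,
  pullDeltaOnBeforeAccessDelay: 50, // group requests arriving within 50 ms
};
```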
Push & webhooks
The push strategy keeps replicas up-to-date when APIs expose change-following capabilities through webhooks, WebSockets, long polling, or similar mechanisms.
Handler programming
Unlike the pull strategy, developers are responsible for setting up subscriptions to the target API. The back-end calls your handler during startup to establish these subscriptions, and you send changes to the back-end for replica updates.
Request object structure
The request provides:
- getPreviousDeltaState(): Fetches delta state asynchronously, useful when mixing push and pull strategies
- cache: Interface for reading from the cache
onChange payload structure
The payload includes:
- nextDeltaState (optional): Updated delta state for recovery on back-end restart
- newOrUpdatedEntries: Array of created/updated records with collection and record data
- deletedEntries: Array of deleted records (full record not required)
Example: CouchDB change feed
Using the nano library to subscribe to CouchDB’s changes stream:
Node.js
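A sketch of how such a subscription might be structured. To stay runnable without a CouchDB server, the feed is injected as any EventEmitter that emits change events shaped like CouchDB change rows ({ seq, id, deleted, doc }); the nano call in the closing comment, the (request, onChanges) handler signature, and the documents collection name are assumptions.

```javascript
const { EventEmitter } = require('node:events');

// Builds a push handler around a change feed.
// openFeed(since) must return an emitter of 'change' events.
function makePushDeltaHandler(openFeed) {
  return async function pushDeltaHandler(request, onChanges) {
    // Resume from the last persisted sequence, or from the beginning.
    const since = (await request.getPreviousDeltaState()) ?? 0;

    openFeed(since).on('change', (change) => {
      const record = change.deleted ? { id: change.id } : { id: change.id, ...change.doc };
      const entry = { collection: 'documents', record };

      onChanges({
        nextDeltaState: change.seq, // lets a restart resume from here
        newOrUpdatedEntries: change.deleted ? [] : [entry],
        deletedEntries: change.deleted ? [entry] : [],
      });
    });
  };
}

// With nano, openFeed could be something like (assumption, not run here):
// (since) => nano('http://localhost:5984').use('mydb')
//   .changesReader.start({ since, includeDocs: true });
```

Injecting the feed keeps the mapping from change rows to replica entries testable with a plain EventEmitter.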
Example: webhook implementation
Using Express to receive webhooks on a separate port:
Node.js
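A sketch of the route logic such a webhook receiver might use. The webhook body shape ({ event, collection, record }), the route path, and the port are hypothetical; the route is written as a plain Express-style (req, res) function so it runs without Express installed, with the assumed wiring shown in comments.

```javascript
// Returns an Express-style route that forwards webhook events to the replica.
function makeWebhookRoute(onChanges) {
  return (req, res) => {
    // Hypothetical webhook body: { event: 'created' | 'updated' | 'deleted', collection, record }
    const { event, collection, record } = req.body;
    const entry = { collection, record };

    onChanges({
      newOrUpdatedEntries: event === 'deleted' ? [] : [entry],
      deletedEntries: event === 'deleted' ? [entry] : [],
    });

    res.status(204).end(); // acknowledge without a body
  };
}

// Assumed wiring inside the push handler (not run here):
// const express = require('express');
// const app = express();
// app.use(express.json());
// app.post('/webhook', makeWebhookRoute(onChanges));
// app.listen(3352); // separate port from the Forest back-end
```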
Schema & references
Schema auto-discovery
When no explicit schema is provided, the back-end attempts to auto-discover structure from imported data. However, this approach has limitations:
- Empty collections cannot be imported
- Performance overhead from sampling data
- Primary keys must be named id
- Composite primary keys unsupported
- Foreign keys aren’t automatically detected
Providing a schema
Supply a schema via the createReplicaDataSource function to avoid auto-discovery limitations. The schema can be static or dynamically generated through Promises or async functions.
Schema syntax
A collection definition includes:
- name: Collection identifier
- fields: Object containing field definitions, supporting nested objects and arrays
- Type options: Boolean, Integer, Number, String, Date, Dateonly, Timeonly, Binary, Enum, Json, Point, Uuid
- defaultValue: Initial value for new records
- enumValues: Possible values for Enum types
- isPrimaryKey: Marks primary key fields
- isReadOnly: Read-only designation
- unique: Uniqueness constraint
- validation: Array of validation rules
- reference: Defines foreign key relationships with target collection details
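A sketch of what a two-collection schema might look like with these attributes. The collection and field names are made up, and the keys inside reference (relationName, targetCollection, targetField) as well as the type key holding one of the type options are assumptions about the exact syntax. The primaryKeysOf helper is hypothetical and only there to inspect the sketch.

```javascript
const schema = [
  {
    name: 'users',
    fields: {
      id: { type: 'Integer', isPrimaryKey: true },
      email: { type: 'String', unique: true },
      role: { type: 'Enum', enumValues: ['admin', 'member'], defaultValue: 'member' },
      address: { street: { type: 'String' }, city: { type: 'String' } }, // nested object
      tags: [{ type: 'String' }], // array of strings
    },
  },
  {
    name: 'posts',
    fields: {
      id: { type: 'Integer', isPrimaryKey: true },
      title: { type: 'String' },
      userId: {
        type: 'Integer',
        // Assumed shape for the foreign key definition.
        reference: { relationName: 'author', targetCollection: 'users', targetField: 'id' },
      },
    },
  },
];

// Hypothetical helper: lists the top-level primary key fields of a collection.
function primaryKeysOf(collection) {
  return Object.entries(collection.fields)
    .filter(([, definition]) => definition && definition.isPrimaryKey === true)
    .map(([name]) => name);
}
```

An explicit schema like this also covers the empty-collection and foreign-key cases that auto-discovery cannot handle.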
Handling complex data
Flatten mode addresses limitations with nested structures and arrays. Options include auto or manual modes, similar to Mongoose driver configuration.
When enabled, flatten mode:
- Automatically transforms nested records
- Creates virtual collections for arrays
- Uses @@@ as field separator in flattened output
- Generates synthetic IDs and foreign keys for relationships
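To illustrate the naming scheme only, here is a hand-written sketch that flattens nested objects using @@@ as separator. It is not the record flattener utility itself: the flattenRecord name is hypothetical, arrays are left untouched rather than moved to virtual collections, and no synthetic IDs are generated.

```javascript
const SEPARATOR = '@@@';

// Illustration only: turns nested objects into flat keys joined by '@@@'.
function flattenRecord(record, prefix = '') {
  const flat = {};

  for (const [key, value] of Object.entries(record)) {
    const path = prefix ? `${prefix}${SEPARATOR}${key}` : key;
    const isPlainObject =
      value !== null &&
      typeof value === 'object' &&
      !Array.isArray(value) &&
      !(value instanceof Date);

    if (isPlainObject) {
      Object.assign(flat, flattenRecord(value, path)); // recurse into nested object
    } else {
      flat[path] = value; // scalars, dates, and arrays are kept as-is in this sketch
    }
  }

  return flat;
}
```

For example, a record with address.city nested under address would end up with a single address@@@city field.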
Write handlers
Implementation requirements
Three optional handlers can be implemented: createRecordHandler, updateRecordHandler, and deleteRecordHandler. Omit any handler for operations not needed.
The createRecordHandler function uniquely supports return values, which proves useful when the target API auto-generates record IDs.
Code example
Node.js
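A sketch of the three handlers against a hypothetical in-memory target API. The (collectionName, record) signatures and the targetApi object are assumptions made for this example; only the handler names and the fact that createRecordHandler may return a value come from this page.

```javascript
// Hypothetical in-memory target API that assigns IDs on creation.
const remoteUsers = new Map();
let nextId = 1;

const targetApi = {
  async create(data) {
    const created = { ...data, id: nextId++ };
    remoteUsers.set(created.id, created);
    return created;
  },
  async update(id, data) {
    remoteUsers.set(id, { ...remoteUsers.get(id), ...data });
  },
  async remove(id) {
    remoteUsers.delete(id);
  },
};

// Returns the record with the ID generated by the target API.
async function createRecordHandler(collectionName, record) {
  const created = await targetApi.create(record);
  return { ...record, id: created.id };
}

// Update and delete only perform the remote operation.
async function updateRecordHandler(collectionName, record) {
  await targetApi.update(record.id, record);
}

async function deleteRecordHandler(collectionName, record) {
  await targetApi.remove(record.id);
}
```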
Key takeaways
- All three write handlers remain optional
- Create handlers can return newly generated IDs from the API
- Update and delete handlers perform remote operations without returning values
- The handlers abstract the communication layer between Forest and external APIs
Want to share your custom datasource with the community? Check out the Forest experimental repository to contribute.