UCD.js Docs

Pipeline API

Reference for pipeline authoring, execution, presets, and CLI entrypoints

Pipeline API Reference

This page is a compact reference for the pipeline authoring surface that exists today.

Core authoring functions

| Package | API | Purpose |
| --- | --- | --- |
| @ucdjs/pipeline-core | definePipeline(...) | Create a pipeline definition with sources, routes, versions, and execution options |
| @ucdjs/pipeline-core | definePipelineRoute(...) | Create a route with a filter, parser, optional transforms, resolver, dependencies, and outputs |
| @ucdjs/pipeline-core | definePipelineSource(...) | Create a source definition around a backend |
| @ucdjs/pipeline-core | definePipelineTransform(...) | Create a reusable transform that runs between parser and resolver |
| @ucdjs/pipeline-core | pipelineOutputSource(...) | Consume published outputs from another pipeline |
| @ucdjs/pipeline-core | filesystemSink(...) | Persist route outputs to the filesystem |

Pipeline hooks

Pipeline definitions can include lifecycle hooks:

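import { definePipeline } from "@ucdjs/pipeline-core";

// `source` and `route` are assumed to be defined elsewhere, for example with
// definePipelineSource(...) and definePipelineRoute(...).
export const pipeline = definePipeline({
  id: "with-hooks",
  name: "With Hooks",
  versions: ["1.0.0"],
  inputs: [source],
  routes: [route],
  hooks: {
    // Invoked with phase "start" and again with phase "end" for the resolve step.
    resolve(ctx) {
      // Log only when the resolve step finished without an error.
      if (ctx.phase === "end" && ctx.error == null) {
        ctx.logger.info("Resolved route", {
          routeId: ctx.routeId,
          outputs: ctx.outputs?.length ?? 0,
        });
      }
    },
  },
});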

| Hook | Context highlights |
| --- | --- |
| pipeline(ctx) | phase, pipelineId, logger, error on failed end |
| version(ctx) | pipeline fields plus version |
| route(ctx) | version fields plus file, routeId, outputs, error |
| parse(ctx) | route fields plus rowCount and filteredRowCount on end |
| resolve(ctx) | route fields plus outputs or error on end |
| output(ctx) | route fields plus outputIndex, outputId, property, sink, locator, and status on end |

Every hook context includes phase: "start" | "end". Hooks may be async and are intended for side effects. If a hook throws, the executor logs the hook failure and continues; hook failures do not change the pipeline result.
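
The isolation rule above can be illustrated with a small, self-contained sketch. It does not use the real executor: `HookContext`, `Hook`, `runHook`, `runWithHooks`, and the `log` callback are hypothetical names introduced here only to show how a sync or async hook can be awaited and its failure contained so that the surrounding result is unaffected.

```typescript
// Hypothetical sketch: NOT the @ucdjs/pipeline-executor implementation.
// It only illustrates "await the hook, log a failure, keep going".

type Phase = "start" | "end";

interface HookContext {
  phase: Phase;
  pipelineId: string;
  error?: unknown;
}

type Hook = (ctx: HookContext) => void | Promise<void>;

// Runs one hook and reports whether it completed. It never rethrows.
async function runHook(
  hook: Hook | undefined,
  ctx: HookContext,
  log: (message: string) => void,
): Promise<boolean> {
  if (!hook) return true;
  try {
    await hook(ctx); // handles both sync and async hooks
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`hook failed during ${ctx.phase}: ${reason}`);
    return false;
  }
}

// Simulated run: the result is computed independently of the hook outcome.
async function runWithHooks(
  hook: Hook,
  log: (message: string) => void,
): Promise<string> {
  await runHook(hook, { phase: "start", pipelineId: "demo" }, log);
  const result = "completed";
  await runHook(hook, { phase: "end", pipelineId: "demo" }, log);
  return result;
}
```

In this sketch a hook that throws on `end` produces one log entry, and `runWithHooks` still resolves to `"completed"`.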

Source helpers

| Import | Purpose |
| --- | --- |
| createMemorySource from @ucdjs/pipeline-core/sources | Best option for tests, examples, and small local fixtures |
| createHttpSource from @ucdjs/pipeline-core/sources | HTTP-backed source helper for custom backends and controlled file enumeration |
| createUnicodeOrgSource from @ucdjs/pipeline-core/sources | Preconfigured HTTP helper targeting https://www.unicode.org/Public/ |

Filter helpers

Common helpers from @ucdjs/pipeline-core:

  • byName(...)
  • byDir(...)
  • byExt(...)
  • byGlob(...)
  • byPath(...)
  • byProp(...)
  • bySource(...)
  • and(...)
  • or(...)
  • not(...)
  • always()
  • never()

See /pipelines/filters for examples and guidance on when to use source-level versus route-level filtering.
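
As a rough illustration of how predicate helpers like these compose, the sketch below defines local stand-ins with the same names. These are not the library's implementations: the `FileInfo` shape and every signature here are assumptions, and the real helpers in @ucdjs/pipeline-core may accept different arguments.

```typescript
// Hypothetical stand-ins that mirror the helper names listed above.
// The real @ucdjs/pipeline-core helpers may have different signatures.

interface FileInfo {
  name: string; // e.g. "Blocks.txt"
  dir: string;  // e.g. "ucd"
  ext: string;  // e.g. ".txt"
}

type FileFilter = (file: FileInfo) => boolean;

const byExt = (ext: string): FileFilter => (file) => file.ext === ext;
const byDir = (dir: string): FileFilter => (file) => file.dir === dir;
const byName = (name: string): FileFilter => (file) => file.name === name;

// Combinators: build a new filter out of existing ones.
const and = (...filters: FileFilter[]): FileFilter => (file) =>
  filters.every((f) => f(file));
const or = (...filters: FileFilter[]): FileFilter => (file) =>
  filters.some((f) => f(file));
const not = (filter: FileFilter): FileFilter => (file) => !filter(file);

// Text or zip files in "ucd", except ReadMe.txt.
const ucdDataFiles = and(
  byDir("ucd"),
  or(byExt(".txt"), byExt(".zip")),
  not(byName("ReadMe.txt")),
);

const files: FileInfo[] = [
  { name: "Blocks.txt", dir: "ucd", ext: ".txt" },
  { name: "ReadMe.txt", dir: "ucd", ext: ".txt" },
  { name: "Unihan.zip", dir: "ucd", ext: ".zip" },
  { name: "emoji-data.txt", dir: "emoji", ext: ".txt" },
];

// Keeps Blocks.txt and Unihan.zip.
const selected = files.filter(ucdDataFiles).map((f) => f.name);
```

Because each helper returns a plain predicate, combinators can be nested to any depth without the filter needing to know how it was built.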

Built-in transforms

Common helpers from @ucdjs/pipeline-core/transforms:

  • createSortTransform(...)
  • createDeduplicateTransform(...)
  • createExpandRangesTransform(...)
  • createNormalizeTransform(...)
  • sortByCodePoint
  • deduplicateRows
  • expandRanges
  • normalizeCodePoints
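
To make the idea of chained transforms concrete, here is a minimal sketch of range expansion, deduplication, and sorting applied in sequence to parsed rows. Everything in it is an assumption for illustration: the `Row` shape, the `RowTransform` type, and the `...Sketch` functions are hypothetical and are not the built-in transforms, whose row type and behavior may differ.

```typescript
// Hypothetical row shape and transforms, for illustration only.
// The built-ins in @ucdjs/pipeline-core/transforms may behave differently.

interface Row {
  codePoint: string; // single hex code point "0041" or a range "0041..0043"
  value: string;
}

type RowTransform = (rows: Row[]) => Row[];

const toHex = (n: number): string =>
  n.toString(16).toUpperCase().padStart(4, "0");

// Turns a range row such as "0041..0043" into one row per code point.
const expandRangesSketch: RowTransform = (rows) => {
  const out: Row[] = [];
  for (const row of rows) {
    const parts = row.codePoint.split("..");
    const start = parseInt(parts[0] ?? "", 16);
    const end = parseInt(parts[1] ?? parts[0] ?? "", 16);
    for (let cp = start; cp <= end; cp++) {
      out.push({ ...row, codePoint: toHex(cp) });
    }
  }
  return out;
};

// Keeps the first row seen for each code point.
const deduplicateSketch: RowTransform = (rows) => {
  const seen = new Set<string>();
  return rows.filter((row) => {
    if (seen.has(row.codePoint)) return false;
    seen.add(row.codePoint);
    return true;
  });
};

// Sorts numerically by code point without mutating the input array.
const sortByCodePointSketch: RowTransform = (rows) =>
  [...rows].sort(
    (a, b) => parseInt(a.codePoint, 16) - parseInt(b.codePoint, 16),
  );

// Applies transforms left to right, feeding each result into the next.
const applyTransforms = (rows: Row[], transforms: RowTransform[]): Row[] =>
  transforms.reduce((acc, transform) => transform(acc), rows);

const output = applyTransforms(
  [
    { codePoint: "0043", value: "Lu" },
    { codePoint: "0041..0043", value: "Lu" },
  ],
  [expandRangesSketch, deduplicateSketch, sortByCodePointSketch],
);
```

Order matters in a chain like this: expanding ranges first lets the deduplication step see the overlap between `0043` and `0041..0043`.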

Presets

Useful exports from @ucdjs/pipeline-presets:

| Category | Examples |
| --- | --- |
| Parsers | standardParser, sequenceParser, unicodeDataParser, createStandardParser(...) |
| Resolvers | propertyJsonResolver, createPropertyJsonResolver(...), createGroupedResolver(...) |
| Routes | unicodeDataRoute, blocksRoute, scriptsRoute, emojiDataRoute, allRoutes |
| Pipeline factories | createBasicPipeline(...), createFullPipeline(...), createEmojiPipeline(...) |

Route runtime context

Inside a route you will commonly use:

  • ctx.logger for structured logs
  • ctx.getRouteData("route-id") to read upstream route data declared through depends
  • ctx.now() for timestamps
  • ctx.normalizeEntries(...) when normalizing resolved entries

Inside filters and transforms you also receive the logger, file metadata, version, and source metadata where relevant.

Execution API

Programmatic execution lives in @ucdjs/pipeline-executor:

import { createPipelineExecutor } from "@ucdjs/pipeline-executor";

// `pipeline` is assumed to be a definition created with definePipeline(...).
const executor = createPipelineExecutor({});
const results = await executor.run([pipeline], {
  cache: true,
  versions: ["1.0.0"],
});

Execution options and hooks:

  • cacheStore to enable route result caching
  • onLog(entry) to receive incremental logs
  • onTrace(trace) to receive execution traces
  • runtime to provide environment-specific logging and output capture

CLI entrypoints

The CLI reference lives at /packages/cli/pipelines. The commands most relevant to authors are:

  • ucd pipelines list
  • ucd pipelines run
  • ucd pipelines cache status
  • ucd pipelines cache refresh
  • ucd pipelines cache clear

Pipeline discovery uses the file pattern **/*.ucd-pipeline.ts.
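
The pattern matches files at any directory depth whose name ends in `.ucd-pipeline.ts`. The sketch below approximates that with a suffix check on the file name. `isPipelineFile` is a hypothetical helper, not part of the CLI, and the CLI's actual glob matcher may treat edge cases such as dotfiles differently.

```typescript
// Approximates the glob **/*.ucd-pipeline.ts with a file-name suffix check.
// Hypothetical helper for illustration; not how the CLI implements discovery.

const PIPELINE_SUFFIX = ".ucd-pipeline.ts";

function isPipelineFile(path: string): boolean {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = path.split(/[\\/]/).pop() ?? "";
  // Require at least one character before the suffix, as "*" is used here.
  return (
    fileName.length > PIPELINE_SUFFIX.length &&
    fileName.endsWith(PIPELINE_SUFFIX)
  );
}

const candidates = [
  "pipelines/blocks.ucd-pipeline.ts",
  "emoji.ucd-pipeline.ts",
  "pipelines/blocks.ts",
  "pipelines/notes.ucd-pipeline.ts.bak",
];

// Keeps only the first two entries.
const discovered = candidates.filter(isPipelineFile);
```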

Run the CLI from the repo root during development:

./packages/cli/bin/ucd.js pipelines list --cwd packages/pipelines/pipeline-playground
