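External integrations

Backstage natively supports importing catalog data through entity descriptor YAML files. However, companies that already have a system for tracking software and its owners can leverage that system by integrating it with Backstage. This article describes two common ways of doing that integration: adding a custom catalog entity provider, or adding a processor.

Background

The catalog has a frontend plugin part that communicates with the backend plugin part through a service API. The backend continuously ingests data from the sources you specify and stores it in its database. For details on how this works, see The Life of an Entity; reading that article first is recommended.

There are two main ways of ingesting data into the catalog: making a custom entity provider, or making a custom processor. Each has its own strengths and drawbacks, but the former is usually preferred. Both are covered in dedicated subsections below.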

Custom Entity Providers

Entity providers sit at the very edge of the catalog. They are the original sources of the entities that form the roots of the processing tree. The dynamic location store API, and the static locations you can specify in your app-config, are two examples of builtin providers in the catalog.

Some defining traits of entity providers:

* You instantiate them individually using code in your backend, and pass them to the catalog builder. Often there's one provider instance per remote system.
* You may be responsible for actively running them. For example, some providers need to be triggered periodically by a method call to know when they are meant to do their job; in that case you'll have to make that happen.
* The timing of their work is entirely detached from the processing loops. One provider may run every 30 seconds, another one on every incoming webhook call of a certain type, etc.
* They can perform detailed updates on the set of entities that they are responsible for. They can make full updates of the entire set, or issue individual additions and removals.
* Their output is a set of unprocessed entities. Those are then subject to the processing loops before becoming final, stitched entities.
* When they remove an entity, the entire subtree of processor-generated entities under that root is eagerly removed as well.

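Creating an Entity Provider

The recommended way of instantiating the catalog backend classes is to use the CatalogBuilder, as illustrated in the example backend. We will create a new EntityProvider subclass that can be added to this catalog builder.

Let's make a simple provider that can refresh a set of entities based on a remote store. The provider part of the interface is actually tiny: you only have to supply a (unique) name, and accept a connection from the environment through which you can issue writes. The rest is up to the individual provider implementation.

It is up to you where you put the code for this new provider class. For quick experimentation you could place it in your backend package, but we recommend putting all extensions like this in a backend module package of their own in the plugins folder: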

yarn new --select backend-module --option id=catalog

The class will have this basic structure:

import { UrlReader } from '@backstage/backend-common';
import { Entity } from '@backstage/catalog-model';
import {
  EntityProvider,
  EntityProviderConnection,
} from '@backstage/plugin-catalog-node';

/**
 * Provides entities from fictional frobs service.
 */
export class FrobsProvider implements EntityProvider {
  private readonly env: string;
  private readonly reader: UrlReader;
  private connection?: EntityProviderConnection;

  /** [1] */
  constructor(env: string, reader: UrlReader) {
    this.env = env;
    this.reader = reader;
  }

  /** [2] */
  getProviderName(): string {
    return `frobs-${this.env}`;
  }

  /** [3] */
  async connect(connection: EntityProviderConnection): Promise<void> {
    this.connection = connection;
  }

  /** [4] */
  async run(): Promise<void> {
    if (!this.connection) {
      throw new Error('Not initialized');
    }

    const response = await this.reader.readUrl(
      `https://frobs-${this.env}.example.com/data`,
    );
    // Convert the response buffer to a string before parsing it as JSON.
    const data = JSON.parse((await response.buffer()).toString());

    /** [5] */
    const entities: Entity[] = frobsToEntities(data);

    /** [6] */
    await this.connection.applyMutation({
      type: 'full',
      entities: entities.map(entity => ({
        entity,
        locationKey: `frobs-provider:${this.env}`,
      })),
    });
  }
}

This class demonstrates several important concepts, some of which are optional. Check out the numbered markings; let's go through them one by one.

1. The class takes an env parameter. This is only illustrative for the sake of the example. We'll use this field to exhibit the type of provider where end users may want or need to make multiple instances of the same provider, and what the implications would be in that case.
2. The catalog requires that all registered providers return a name that is unique among those providers, and which is stable over time. The reason for these requirements is, the emitted entities for each provider instance all hang around in a closed bucket of their own. This bucket needs to be tied to their provider over time, and across backend restarts. We'll see below how the processor emits some entities and what that means for its own bucket.
3. Once the catalog engine starts up, it immediately issues the connect call to all known providers. This forms the bond between the code and the database. This is also an opportunity for the provider to do one-time updates on the connection at startup if it wants to.
4. At this point t
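The FrobsProvider example calls frobsToEntities without defining it. The following is a minimal sketch of what such a helper might look like, not the actual implementation. The Frob record shape, the sourceUrl parameter, and the chosen spec values are all assumptions made for illustration, and CatalogEntity is a local stand-in for the Entity type from @backstage/catalog-model so that the block is self-contained. The two annotation keys are the ones behind ANNOTATION_LOCATION and ANNOTATION_ORIGIN_LOCATION.

```typescript
// Minimal local stand-in for the Entity type from @backstage/catalog-model,
// declared here only so the sketch is self-contained.
interface CatalogEntity {
  apiVersion: string;
  kind: string;
  metadata: { name: string; annotations?: Record<string, string> };
  spec?: Record<string, unknown>;
}

// Hypothetical shape of one record returned by the fictional frobs service.
interface Frob {
  id: string;
  team: string;
}

// Hypothetical helper: maps raw frobs records to unprocessed Component
// entities. The sourceUrl parameter is an addition made by this sketch.
function frobsToEntities(frobs: Frob[], sourceUrl: string): CatalogEntity[] {
  const locationRef = `url:${sourceUrl}`;
  return frobs.map(frob => ({
    apiVersion: 'backstage.io/v1alpha1',
    kind: 'Component',
    metadata: {
      name: frob.id,
      annotations: {
        'backstage.io/managed-by-location': locationRef,
        'backstage.io/managed-by-origin-location': locationRef,
      },
    },
    // Illustrative values; a real mapping would derive these from the data.
    spec: { type: 'service', lifecycle: 'production', owner: frob.team },
  }));
}

const frobEntities = frobsToEntities(
  [
    { id: 'frob-a', team: 'team-x' },
    { id: 'frob-b', team: 'team-y' },
  ],
  'https://frobs-production.example.com/data',
);
```

The result is an array of plain, unprocessed entities that could be handed to applyMutation as in the class above.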

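Provider Mutations

Let's circle back to the bucket analogy.

Each provider instance (not each class, but each instance registered with the catalog) has access to its own bucket of entities, identified by the stable name of the provider instance. Every time a provider issues a "mutation", it changes the contents of that bucket. Nothing outside that bucket is accessible.

There are two different types of mutation.

The first is 'full', which means to throw away the contents of the bucket and replace them with all of the new contents specified. Under the hood this is implemented through an efficient delta mechanism for performance reasons, since it is common that the difference from one run to the next is actually very small. This strategy is convenient for providers that can easily batch-fetch the entire subject material from a remote source, and that cannot or do not want to compute deltas.

The other mutation type is 'delta', which lets the provider explicitly add or remove individual entities in its bucket. This mutation is convenient for event-based providers, for example, and it can also be more performant since no deltas need to be computed and previous bucket contents outside of the targeted set do not have to be taken into account.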
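The difference between the two mutation types can be illustrated with a small in-memory model of a single provider bucket. This is a simplified sketch and not the catalog's real storage code: entities are reduced to entity reference strings, and the BucketMutation type only loosely mirrors the shape accepted by applyMutation.

```typescript
// Simplified stand-ins for illustration; not the catalog's real types.
type EntityRef = string; // e.g. "component:default/frob-a"

type BucketMutation =
  | { type: 'full'; entities: EntityRef[] }
  | { type: 'delta'; added: EntityRef[]; removed: EntityRef[] };

// Applies a mutation to one provider's bucket and returns the new contents.
function applyToBucket(
  bucket: Set<EntityRef>,
  mutation: BucketMutation,
): Set<EntityRef> {
  if (mutation.type === 'full') {
    // Full: discard the old contents and keep only what was supplied.
    return new Set(mutation.entities);
  }
  // Delta: start from the current contents, then add and remove explicitly.
  const next = new Set(bucket);
  for (const ref of mutation.added) next.add(ref);
  for (const ref of mutation.removed) next.delete(ref);
  return next;
}

let bucket = new Set<EntityRef>();
bucket = applyToBucket(bucket, {
  type: 'full',
  entities: ['component:default/a', 'component:default/b'],
});
bucket = applyToBucket(bucket, {
  type: 'delta',
  added: ['component:default/c'],
  removed: ['component:default/a'],
});
const bucketContents = Array.from(bucket).sort().join(', ');
// bucketContents is "component:default/b, component:default/c"
```

Note that the delta step never needs to know about component:default/b, which is why delta mutations suit event-driven sources.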

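In all cases, the mutation entities are treated as unprocessed entities. When they land in the database, the registered catalog processors go to work on them to transform them into final, processed and stitched entities ready for consumption.

Every entity emitted by a provider can have a locationKey, as shown above. This is a critical conflict resolution key, in the form of an opaque string that should be unique for each location that an entity could be located at, and undefined if the entity does not have a fixed location.

In practice it should be set to the serialized location reference if the entity is stored in Git, for example https://github.com/backstage/backstage/blob/master/catalog-info.yaml. In our example we set it to a string that is distinctive for the provider class, plus its instance-identifying property, which in this case is env.

A conflict between two entity definitions happens when they have the same entity reference, i.e. kind, namespace, and name. In the event of a conflict, such as when two "competing" providers try to emit entities with the same reference triplet, the location key is used to resolve it according to the following rules:

* If the entity is already present in the database but does not have a location key set, the new entity wins and overrides the existing one.
* If the entity is already present in the database, the new entity only wins if the location keys of the existing and new entity are the same.
* If the entity is not already present, the entity is inserted into the database along with the provided location key.

This may seem complex, but it is a vital mechanism for ensuring that users cannot perform "rogue" takeovers of entities that others have already registered.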
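The three conflict rules can be written down as a small decision function. This is a sketch of the rules as stated in the text, not the catalog's actual database logic; StoredEntity, WriteOutcome, and resolveConflict are hypothetical names introduced here.

```typescript
// Simplified stand-in for a stored entity row; for illustration only.
interface StoredEntity {
  entityRef: string; // kind:namespace/name
  locationKey?: string;
}

type WriteOutcome = 'insert' | 'overwrite' | 'reject';

// Decides what happens when `incoming` is written and `existing` is the row
// (if any) that already has the same entity reference.
function resolveConflict(
  existing: StoredEntity | undefined,
  incoming: StoredEntity,
): WriteOutcome {
  // Not present yet: insert together with the provided location key.
  if (existing === undefined) return 'insert';
  // Present without a location key: the new entity wins.
  if (existing.locationKey === undefined) return 'overwrite';
  // Present with a location key: the new entity wins only on a key match.
  return existing.locationKey === incoming.locationKey ? 'overwrite' : 'reject';
}

const frobRef = 'component:default/frob-a';
const ours = { entityRef: frobRef, locationKey: 'frobs-provider:production' };
const theirs = { entityRef: frobRef, locationKey: 'other-provider' };

const outcomes = [
  resolveConflict(undefined, ours),
  resolveConflict({ entityRef: frobRef }, ours),
  resolveConflict(ours, ours),
  resolveConflict(ours, theirs),
];
// outcomes is ['insert', 'overwrite', 'overwrite', 'reject']
```

The last case is the "takeover" scenario: a second provider with a different location key cannot replace an entity that the first provider already owns.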

Installing the Provider

You should now be able to add this class to your backend in packages/backend/src/plugins/catalog.ts:

packages/backend/src/plugins/catalog.ts
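import { CatalogBuilder } from '@backstage/plugin-catalog-backend';
import { Router } from 'express';
import { PluginEnvironment } from '../types';
import { FrobsProvider } from '../path/to/class';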

export default async function createPlugin(
  env: PluginEnvironment,
): Promise<Router> {
  const builder = CatalogBuilder.create(env);
  const frobs = new FrobsProvider('production', env.reader);
  builder.addEntityProvider(frobs);

  const { processingEngine, router } = await builder.build();
  await processingEngine.start();

  await env.scheduler.scheduleTask({
    id: 'run_frobs_refresh',
    fn: async () => {
      await frobs.run();
    },
    frequency: { minutes: 30 },
    timeout: { minutes: 10 },
  });

  // ..
}

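Note that this example uses the builtin scheduler facility to call the provider's run method at regular intervals. It is a suitable driver for this particular type of recurring task. The scheduling is placed after the construction and startup phase of the rest of the catalog, because by that point the connect call has been made to the provider.

Start up the backend. It should now start reading from the previously registered location, and you'll see your entities start to appear in Backstage.

Example User Entity Provider

If you have a third-party source of entities that you wish to use, such as an internal HR system, you are not limited to the entity providers we supply. The same applies if you simply wish to add your own data to what an existing entity provider supplies.

We can create an entity provider that reads entities from that source.

We create a basic entity provider as shown above. In the example below we want to extract users from an HR system. It assumes that the HR system already holds each user's slackUserId; to obtain that information, see the Slack API.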

import axios from 'axios'
import {
  ANNOTATION_LOCATION,
  ANNOTATION_ORIGIN_LOCATION,
  UserEntity,
} from '@backstage/catalog-model'
import { Config } from '@backstage/config'
import {
  EntityProvider,
  EntityProviderConnection,
} from '@backstage/plugin-catalog-node'
import { WebClient } from '@slack/web-api'
import { kebabCase } from 'lodash'
import { Logger } from 'winston'

interface Staff {
  displayName: string
  slackUserId: string
  jobTitle: string
  photoUrl: string
  address: string
  email:string
}

export class UserEntityProvider implements EntityProvider {
  private readonly getStaffUrl: string
  protected readonly slackTeam: string
  protected readonly slackToken: string
  protected connection?: EntityProviderConnection

  static fromConfig(config: Config, options: { logger: Logger }) {
    const getStaffUrl = config.getString('staff.url')
    const slackToken = config.getString('slack.token')
    const slackTeam = config.getString('slack.team')
    return new UserEntityProvider({
      ...options,
      getStaffUrl,
      slackToken,
      slackTeam,
    })
  }

  private constructor(options: {
    getStaffUrl: string
    slackToken: string
    slackTeam: string
  }) {
    this.getStaffUrl = options.getStaffUrl
    this.slackToken = options.slackToken
    this.slackTeam = options.slackTeam
  }

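  getProviderName(): string {
    // Must be unique among registered providers and stable over time.
    // The value used here is only illustrative.
    return 'hr-users'
  }

  async getAllStaff(): Promise<Staff[]> {
    // Assumes the HR endpoint responds with a JSON array of Staff records.
    const response = await axios.get<Staff[]>(this.getStaffUrl)
    return response.data
  }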

  public async connect(connection: EntityProviderConnection): Promise<void> {
    this.connection = connection
  }

  async run(): Promise<void> {
    if (!this.connection) {
      throw new Error('User Connection Not initialized')
    }

    const userResources: UserEntity[] = []
    const staff = await this.getAllStaff()

    for (const user of staff) {
      // we can add any links here in this case it would be adding a slack link to the users so you can directly slack them.
      const links =
        user.slackUserId != null && user.slackUserId.length > 0
          ? [
              {
                url: `slack://user?team=${this.slackTeam}&id=${user.slackUserId}`,
                title: 'Slack',
                icon: 'message',
              },
            ]
          : undefined
      const userEntity: UserEntity = {
        kind: 'User',
        apiVersion: 'backstage.io/v1alpha1',
        metadata: {
          annotations: {
            [ANNOTATION_LOCATION]: 'hr-user-https://www.hrurl.com/',
            [ANNOTATION_ORIGIN_LOCATION]: 'hr-user-https://www.hrurl.com/',
          },
          links,
          // name of the entity
          name: kebabCase(user.displayName),
          // name for display purposes could be anything including email
          title: user.displayName,
        },
        spec: {
          profile: {
            displayName: user.displayName,
            email: user.email,
            picture: user.photoUrl,
          },
          memberOf: [],
        },
      }

      userResources.push(userEntity)
    }

    await this.connection.applyMutation({
      type: 'full',
      entities: userResources.map((entity) => ({
        entity,
        locationKey: 'hr-user-https://www.hrurl.com/',
      })),
    })
  }
}

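Custom Processors

The other possible way of ingesting data into the catalog is through location-reading catalog processors.

Processors sit in the middle of the processing loops of the catalog. They are responsible for updating and finalizing unprocessed entities on their way to becoming final, stitched entities. Importantly, they can also emit other entities while doing so. Those then form branches of the entity tree.

Some defining traits of processors:

* Their invocation is driven by the fixed processing loop. All processors are unconditionally and repeatedly called for all entities. You cannot control this behavior, apart from adjusting the frequency of the loop, which then applies equally to all processors.
* They cannot control in detail the entities that they emit; the only effective operation is an upsert on their children. If they stop emitting a certain child, that child becomes marked as an orphan; processors cannot delete it.
* Their input is an unprocessed entity, and their output is modifications to that same entity plus possibly some auxiliary data, including unprocessed child entities.

Processors and the Ingestion Loop

The catalog holds a number of registered locations that were added either by site admins or by individual Backstage users. Their purpose is to reference some sort of data that the catalog keeps itself up to date with. Each location has a type and a target, both of which are strings.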

# Example location
type: url
target: https://github.com/backstage/backstage/blob/master/catalog-info.yaml

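The builtin catalog backend has an ingestion loop that periodically goes through all of these registered locations, and pushes them and their resulting output data through the list of processors.

Processors are classes that the site admin has registered with the catalog at startup. They are at the heart of all catalog logic, and can read the contents of locations, modify in-flight entities that were read out of a location, perform validation, and more. The catalog comes with a set of builtin processors that can read from a list of well-known location types and perform the basic processing needs, but more can be added by the organization that adopts Backstage.

We will now show the process of creating a new processor and location type, which enables the ingestion of catalog data from an existing external API.

Deciding on the New Locations

The first step is to decide how we want to point at the system that holds our data. Let's assume that it is internally named System-X and that its API can be reached through HTTP REST calls.

Let's decide that our locations shall take the following form: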

type: system-x
target: http://systemx.services.example.net/api/v2

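It has a type that we made up ourselves, and a target that conveniently points at the actual API endpoint to talk to.

Now we have to make the catalog aware of such a location, so that it can start feeding it into the ingestion loop. For this kind of integration, you would typically add it to the list of static, always-available locations in the config.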

app-config.yaml
catalog:
  locations:
    - type: system-x
      target: http://systemx.services.example.net/api/v2

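If you start up the backend now, it will periodically report that it could not find a processor that supports that location. So let's make one that does!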
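To see why that message appears, it can help to picture how a location is offered to the registered processors in turn until one of them claims it. The following is a simplified, synchronous model of that dispatch and not the catalog's real code (the real readLocation is asynchronous and takes more arguments). LocationReader and dispatchLocation are hypothetical names, the reader names are only illustrative, and the error text is an approximation.

```typescript
interface LocationSpec {
  type: string;
  target: string;
}

// Simplified, synchronous stand-in for a processor's readLocation method.
// Returning true means "I handled this location".
interface LocationReader {
  name: string;
  readLocation(location: LocationSpec): boolean;
}

// Offers the location to each reader in order; the first one that returns
// true wins. If nobody claims it, an error is reported.
function dispatchLocation(
  location: LocationSpec,
  readers: LocationReader[],
): string {
  for (const reader of readers) {
    if (reader.readLocation(location)) {
      return reader.name;
    }
  }
  throw new Error(
    `No processor was able to handle reading of ${location.type}:${location.target}`,
  );
}

const urlReader: LocationReader = {
  name: 'UrlReaderProcessor',
  readLocation: l => l.type === 'url',
};
const systemXReader: LocationReader = {
  name: 'SystemXReaderProcessor',
  readLocation: l => l.type === 'system-x',
};
const systemXLocation: LocationSpec = {
  type: 'system-x',
  target: 'http://systemx.services.example.net/api/v2',
};

// Without a matching reader, the dispatch fails.
let failure = '';
try {
  dispatchLocation(systemXLocation, [urlReader]);
} catch (error) {
  failure = error instanceof Error ? error.message : String(error);
}

// Once a matching reader is registered, it picks up the location.
const handledBy = dispatchLocation(systemXLocation, [urlReader, systemXReader]);
```

This is also why a processor should return false for location types it does not care about: doing so lets the next processor in the list have a go.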

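Creating a Catalog Data Reader Processor

The recommended way of instantiating the catalog backend classes is to use the CatalogBuilder, as illustrated in the example backend. We will create a new CatalogProcessor subclass that can be added to this catalog builder.

It is up to you where you put the code for this new processor class. For quick experimentation you could place it in your backend package, but we recommend putting all extensions like this in a backend module package of their own in the plugins folder: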

yarn new --select backend-module --option id=catalog

The class will have this basic structure:

import { UrlReader } from '@backstage/backend-common';
import {
  processingResult,
  CatalogProcessor,
  CatalogProcessorEmit,
} from '@backstage/plugin-catalog-node';

import { LocationSpec } from '@backstage/plugin-catalog-common';

// A processor that reads from the fictional System-X
export class SystemXReaderProcessor implements CatalogProcessor {
  constructor(private readonly reader: UrlReader) {}

  getProcessorName(): string {
    return 'SystemXReaderProcessor';
  }

  async readLocation(
    location: LocationSpec,
    _optional: boolean,
    emit: CatalogProcessorEmit,
  ): Promise<boolean> {
    // Pick a custom location type string. A location will be
    // registered later with this type.
    if (location.type !== 'system-x') {
      return false;
    }

    try {
      // Use the builtin reader facility to grab data from the
      // API. If you prefer, you can just use plain fetch here
      // (from the node-fetch package), or any other method of
      // your choosing.
      const response = await this.reader.readUrl(location.target);
      const json = JSON.parse((await response.buffer()).toString());
      // Repeatedly call emit(processingResult.entity(location, <entity>))
    } catch (error) {
      const message = `Unable to read ${location.type}, ${error}`;
      emit(processingResult.generalError(location, message));
    }

    return true;
  }
}

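The key points to note are:

* Make a class that implements CatalogProcessor
* Only act on location types that you care about, and leave the rest alone by returning false
* Read the data from the external system in any way you see fit. Use the location target field if you designed it as mentioned above
* Call emit any number of times with the results of that process
* Finally return true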
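The "call emit any number of times" step could look roughly like the sketch below. It assumes that System-X returns a JSON array of services with slug and ownerTeam fields, which is purely hypothetical. The types are local stand-ins so that the block is self-contained; in a real processor you would import Entity and LocationSpec and build each result with processingResult.entity(location, entity) rather than by hand.

```typescript
// Local stand-ins for the Backstage types, so the sketch is self-contained.
interface CatalogEntity {
  apiVersion: string;
  kind: string;
  metadata: { name: string };
  spec?: Record<string, unknown>;
}
interface LocationSpec {
  type: string;
  target: string;
}
interface EntityResult {
  type: 'entity';
  location: LocationSpec;
  entity: CatalogEntity;
}
type Emit = (result: EntityResult) => void;

// Hypothetical shape of one service record in the System-X API response.
interface SystemXService {
  slug: string;
  ownerTeam: string;
}

// Emits one unprocessed Component entity per System-X service.
function emitSystemXServices(
  services: SystemXService[],
  location: LocationSpec,
  emit: Emit,
): void {
  for (const service of services) {
    emit({
      type: 'entity',
      location,
      entity: {
        apiVersion: 'backstage.io/v1alpha1',
        kind: 'Component',
        metadata: { name: service.slug },
        // Illustrative values; derive them from your own data.
        spec: { type: 'service', lifecycle: 'production', owner: service.ownerTeam },
      },
    });
  }
}

// Usage: collect the emitted results in an array for inspection.
const emitted: EntityResult[] = [];
emitSystemXServices(
  [
    { slug: 'billing', ownerTeam: 'team-payments' },
    { slug: 'search', ownerTeam: 'team-discovery' },
  ],
  { type: 'system-x', target: 'http://systemx.services.example.net/api/v2' },
  result => emitted.push(result),
);
```

Keeping the mapping in a separate function like this makes it straightforward to unit test without a running catalog.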

You should now be able to add this class to your backend in packages/backend/src/plugins/catalog.ts:

packages/backend/src/plugins/catalog.ts
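import { CatalogBuilder } from '@backstage/plugin-catalog-backend';
import { Router } from 'express';
import { PluginEnvironment } from '../types';
import { SystemXReaderProcessor } from '../path/to/class';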

export default async function createPlugin(
  env: PluginEnvironment,
): Promise<Router> {
  const builder = CatalogBuilder.create(env);
  builder.addProcessor(new SystemXReaderProcessor(env.reader));

  // ..
}

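Start up the backend. It should now start reading from the previously registered location, and you'll see your entities start to appear in Backstage.

Caching Processing Results

The catalog periodically refreshes its entities, and in doing so it calls out to external systems to fetch changes. This can be taxing for upstream services, and large deployments may get rate limited if too many requests are sent. Luckily, many external systems provide ETag support for checking for changes, which usually does not count towards the quota and saves resources both internally and externally.

The catalog has builtin support for leveraging ETags when refreshing external locations in GitHub. This example aims to demonstrate how to add the same behavior to the system-x processor that we implemented earlier.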

import { UrlReader } from '@backstage/backend-common';
import { Entity } from '@backstage/catalog-model';
import {
  processingResult,
  CatalogProcessor,
  CatalogProcessorEmit,
  CatalogProcessorCache,
  CatalogProcessorParser,
  LocationSpec,
} from '@backstage/plugin-catalog-node';

// It's recommended to always bump the CACHE_KEY version if you make
// changes to the processor implementation or CacheItem.
const CACHE_KEY = 'v1';

// Our cache item contains the ETag used in the upstream request
// as well as the processing result used when the Etag matches.
// Bump the CACHE_KEY version if you make any changes to this type.
type CacheItem = {
  etag: string;
  entity: Entity;
};

export class SystemXReaderProcessor implements CatalogProcessor {
  constructor(private readonly reader: UrlReader) {}

  getProcessorName() {
    // The processor name must be unique.
    return 'system-x-processor';
  }

  async readLocation(
    location: LocationSpec,
    _optional: boolean,
    emit: CatalogProcessorEmit,
    _parser: CatalogProcessorParser,
    cache: CatalogProcessorCache,
  ): Promise<boolean> {
    // Pick a custom location type string. A location will be
    // registered later with this type.
    if (location.type !== 'system-x') {
      return false;
    }
    const cacheItem = await cache.get<CacheItem>(CACHE_KEY);
    try {
      // This assumes an URL reader that returns the response together with the ETag.
      // We send the ETag from the previous run if it exists.
      // The previous ETag will be set in the headers for the outgoing request and system-x
      // is going to throw NOT_MODIFIED (HTTP 304) if the ETag matches.
      const response = await this.reader.readUrl(location.target, {
        etag: cacheItem?.etag,
      });
      if (!response) {
        // readUrl is currently optional to implement so we have to check if we get a response back.
        throw new Error(
          'No URL reader that can parse system-x targets installed',
        );
      }

      // ETag is optional in the response but we need it to cache the result.
      if (!response.etag) {
        throw new Error(
          'No ETag returned from system-x, cannot use response for caching',
        );
      }

      // For this example the JSON payload is a single entity.
      const entity: Entity = JSON.parse((await response.buffer()).toString());
      emit(processingResult.entity(location, entity));

      // Update the cache with the new ETag and entity used for the next run.
      await cache.set<CacheItem>(CACHE_KEY, {
        etag: response.etag,
        entity,
      });
    } catch (error) {
      if (error.name === 'NotModifiedError' && cacheItem) {
        // The ETag matches and we have a cached value from the previous run.
        emit(processingResult.entity(location, cacheItem.entity));
      } else {
        // Only report an error when the failure was not a cache hit.
        const message = `Unable to read ${location.type}, ${error}`;
        emit(processingResult.generalError(location, message));
      }
    }

    return true;
  }
}
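The control flow of the caching processor can be tried out in isolation with a small synchronous model. This is only a sketch under stated assumptions: fetchFromUpstream is a fake stand-in for the reader's readUrl, a Map stands in for CatalogProcessorCache, and the payload is kept as a string instead of a parsed entity.

```typescript
interface CacheItem {
  etag: string;
  body: string;
}

// Fake upstream state; mutate it to simulate a change in System-X.
const upstream = { etag: 'etag-1', body: '{"name":"billing"}' };

// Fake stand-in for readUrl: throws a NotModifiedError when the supplied
// ETag matches the current upstream ETag, otherwise returns fresh data.
function fetchFromUpstream(etag?: string): CacheItem {
  if (etag === upstream.etag) {
    const error = new Error('Not modified');
    error.name = 'NotModifiedError';
    throw error;
  }
  return { etag: upstream.etag, body: upstream.body };
}

// Mirrors the processor flow: send the cached ETag, store fresh responses,
// and fall back to the cached body when upstream reports no change.
function readWithEtagCache(
  cache: Map<string, CacheItem>,
  key: string,
): { body: string; fromCache: boolean } {
  const cached = cache.get(key);
  try {
    const response = fetchFromUpstream(cached?.etag);
    cache.set(key, response);
    return { body: response.body, fromCache: false };
  } catch (error) {
    if (error instanceof Error && error.name === 'NotModifiedError' && cached) {
      return { body: cached.body, fromCache: true };
    }
    // Any other failure is a real error and is passed on.
    throw error;
  }
}

const etagCache = new Map<string, CacheItem>();
const firstRead = readWithEtagCache(etagCache, 'v1'); // fresh fetch
const secondRead = readWithEtagCache(etagCache, 'v1'); // served from cache

upstream.etag = 'etag-2';
upstream.body = '{"name":"billing-v2"}';
const thirdRead = readWithEtagCache(etagCache, 'v1'); // fresh fetch again
```

The not-modified branch returns without reporting an error, which is the behavior a cache hit should have in the processor as well.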

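Supporting Different Metadata File Formats

Sometimes you may already have files in GitHub, or in some other provider that Backstage already supports, but the metadata format you use differs from catalog-info.yaml. In this case you can implement a custom parser that reads the files and converts them on the fly to the Entity format. It integrates seamlessly into the catalog, so that you can use things like the GithubEntityProvider to read these files.

What you need to do is provide a custom CatalogProcessorParser and pass it to builder.setEntityDataParser.

Let's say the format looks something like this: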

id: my-service
type: service
author: [email protected]

We need to build a custom parser that can read this format and convert it to the Entity format that Backstage expects.

packages/backend/src/lib/customEntityDataParser.ts
import {
  CatalogProcessorParser,
  CatalogProcessorResult,
  LocationSpec,
  processingResult,
} from '@backstage/plugin-catalog-node';
import yaml from 'yaml';
import {
  Entity,
  stringifyLocationRef,
  ANNOTATION_ORIGIN_LOCATION,
  ANNOTATION_LOCATION,
} from '@backstage/catalog-model';
import _ from 'lodash';

// This implementation will map whatever your own format is into valid Entity objects.
const makeEntityFromCustomFormatJson = (
  component: { id: string; type: string; author: string },
  location: LocationSpec,
): Entity => {
  return {
    apiVersion: 'backstage.io/v1alpha1',
    kind: 'Component',
    metadata: {
      name: component.id,
      namespace: 'default',
      annotations: {
        [ANNOTATION_LOCATION]: `${location.type}:${location.target}`,
        [ANNOTATION_ORIGIN_LOCATION]: `${location.type}:${location.target}`,
      },
    },
    spec: {
      type: component.type,
      owner: component.author,
      lifecycle: 'experimental'
    }
  }
};

export const customEntityDataParser: CatalogProcessorParser = async function* ({
  data,
  location,
}) {
  let documents: yaml.Document.Parsed[];
  try {
    // let's treat the incoming file always as yaml, you can of course change this if your format is not yaml.
    documents = yaml.parseAllDocuments(data.toString('utf8')).filter(d => d);
  } catch (e) {
    // if we failed to parse as yaml throw some errors.
    const loc = stringifyLocationRef(location);
    const message = `Failed to parse YAML at ${loc}, ${e}`;
    yield processingResult.generalError(location, message);
    return;
  }

  for (const document of documents) {
    // If there's errors parsing the document as yaml, we should throw an error.
    if (document.errors?.length) {
      const loc = stringifyLocationRef(location);
      const message = `YAML error at ${loc}, ${document.errors[0]}`;
      yield processingResult.generalError(location, message);
    } else {
      // Convert the document to JSON
      const json = document.toJSON();
      if (_.isPlainObject(json)) {
        // Is this a catalog-info.yaml file?
        if (json.apiVersion) {
          yield processingResult.entity(location, json as Entity);

        // let's treat this like it's our custom format instead.
        } else {
          yield processingResult.entity(
            location,
            makeEntityFromCustomFormatJson(json, location),
          );
        }
      } else if (json === null) {
        // Ignore null values, these happen if there is an empty document in the
        // YAML file, for example if --- is added to the end of the file.
      } else {
        // We don't support this format.
        const message = `Expected object at root, got ${typeof json}`;
        yield processingResult.generalError(location, message);
      }
    }
  }
}

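This is a lot of code for now. Because this is a fairly niche use case, we do not currently provide many helpers that make it easier to supply custom implementations or to compose different parsers together.

You should then be able to pass this customEntityDataParser to the CatalogBuilder: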

packages/backend/src/plugins/catalog.ts
import { customEntityDataParser } from '../lib/customEntityDataParser';

...

builder.setEntityDataParser(customEntityDataParser);
