# OpenAPI YAML Guide

How to add or update endpoint definitions in `openapi.yaml`. The backend **stage branch** serializers are the source of truth for schemas and field names.

## File structure

```
openapi.yaml
├── info:
│   └── description         - top-level API overview (markdown in single-quoted YAML)
├── paths:                  - all endpoint definitions, ordered by sidebar position
│   └── /public/v1/.../
│       └── post: / get:    - method block
├── components:
│   └── schemas:            - all request/response models
├── tags:                   - tag definitions (name + optional description)
└── x-tagGroups:            - sidebar navigation groups (CRITICAL for visibility)
```

## Adding a new endpoint

### 1. Path entry

```yaml
  /public/v1/creators/your/route/:
    post:
      operationId: public_v1_creators_your_route_create
      description: 'Single-line summary of what the endpoint does.

        **What you get**

        - First bullet describing the primary return value.

        - Second bullet with additional detail.

        **Credits**

        - X credits per successful request. If no data is returned, no credits are deducted.

        '
      summary: Your Endpoint Name
      tags:
      - Your Endpoint Name
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
          application/x-www-form-urlencoded:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
        required: true
      security:
      - Bearer: []
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/YourResponseSchema'
          description: ''
      x-credits: X credits per successful request.
```

### Key rules

- **Trailing slash** on all paths: `/public/v1/creators/socials/` not `/public/v1/creators/socials`.
- **`description`** is a single-quoted YAML string. Escape literal single quotes by doubling them: `can''t` renders as `can't`.
- **Indentation** inside the description: 8 spaces (the content is indented relative to the `description:` key).
- **Required sections**: **What you get** and **Credits**. These two must always be present. Additional sections like **When to use**, **Result format**, or **How it works** are optional; add them only when the endpoint needs extra explanation. Each section heading uses `**bold**` markdown.
- **`summary`** must match the tag name exactly, because it becomes the sidebar label.
- **Multiple endpoints per tag**: Several endpoints can share one tag (e.g. all batch endpoints use `Batch enrichment`). They appear as separate operations on the same page.
- **`operationId`** convention: `public_v1_{path_segments_with_underscores}_{create|list|retrieve}`. Use `create` for POST, `list` for GET (collection), `retrieve` for GET (single resource).
- **Three content types** in `requestBody` for standard POST endpoints: `application/json`, `application/x-www-form-urlencoded`, `multipart/form-data`. All three point to the same schema `$ref`. **File upload endpoints** (like batch create) use only `multipart/form-data`.
- **`security`**: always `- Bearer: []`.
- **`x-credits`** is a plain-text string placed after `responses`.
- The AI prompt `<div>` placeholder must be the **last thing** before the closing `'`. Its `data-endpoint` value is a kebab-case key that matches the `ENDPOINTS` array in `copy-for-ai.js`.

### GET endpoints

For GET endpoints, use `parameters:` instead of `requestBody:`. Query params use `in: query`, path params use `in: path`:

```yaml
parameters:
- in: path
  name: batch_id
  schema:
    type: string
  required: true
  description: The batch job identifier
- in: query
  name: format
  schema:
    type: string
    enum: [csv, json]
    default: csv
  required: false
```

### Error responses

Most endpoints only define `'200'`. For endpoints with documented error cases, add error responses:

```yaml
responses:
  '200':
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/YourResponseSchema'
    description: ''
  '400':
    content:
      application/json:
        schema:
          type: object
          additionalProperties: {}
    description: Bad Request - Invalid input
  '403':
    content:
      application/json:
        schema:
          type: object
          additionalProperties: {}
    description: Forbidden - Insufficient permissions
  '429':
    content:
      application/json:
        schema:
          type: object
          additionalProperties: {}
    description: Too Many Requests - Rate limit exceeded
```

## Schema conventions

### Match the backend serializer exactly

| Backend (Django REST Framework) | OpenAPI 3.0.3 |
| --- | --- |
| `serializers.CharField()` | `type: string` |
| `serializers.CharField(allow_blank=True)` | `type: string` (no special mapping) |
| `serializers.IntegerField()` | `type: integer` |
| `serializers.FloatField()` | `type: number` |
| `serializers.BooleanField()` | `type: boolean` |
| `serializers.DictField()` | `type: object` |
| `serializers.ListField(child=CharField())` | `type: array` + `items: { type: string }` |
| `serializers.ChoiceField(choices=[...])` | `type: string` + `enum: [...]` |
| `allow_null=True` | Add `nullable: true` to the property |
| `required=False` | Do NOT include in the `required:` array |
| `required=True` (or field has no `required=False`) | Include in the `required:` array |
| `default=value` | Add `default: value` to the property |
| `min_length=N` on a `ListField` | `minItems: N` |
| `max_length=N` on a `ListField` | `maxItems: N` |
| `help_text="..."` | Add `description: '...'` to the property |
| Nested `SomeSerializer()` | `$ref: '#/components/schemas/Some'` |
| Nested `SomeSerializer(many=True)` | `type: array` + `items: { $ref: '#/components/schemas/Some' }` |
| Nested `SomeSerializer(required=False, allow_null=True)` | `$ref` with `nullable: true` |

### Pydantic models

Some backend code uses Pydantic `BaseModel` instead of DRF serializers (e.g. `SocialProfileFields`). The mapping is similar:

| Pydantic | OpenAPI |
| --- | --- |
| `Field(default=None)` | `nullable: true`, not in `required` |
| `Optional[str]` | `type: string`, `nullable: true` |
| `Optional[int]` | `type: integer`, `nullable: true` |
| `Optional[bool]` | `type: boolean`, `nullable: true` |
| `Optional[list[str]]` | `type: array` + `items: { type: string }`, `nullable: true` |
| `validation_alias=AliasChoices(...)` | Use the **first** alias as the field name (that's the canonical name) |

### Field names

Use the **exact** field names from the backend serializer. If the serializer uses `snake_case`, the schema uses `snake_case`. If it uses `camelCase` (e.g. `userId`), the schema uses `camelCase`. **Do not rename fields.**

### Identifier field descriptions

Any field that acts as an identifier (`handle`, `creators`, `email`, `post_id`, etc.) **must** have a `description` explaining exactly what values are accepted. Check the backend view to see how the field is processed.

Example: `handle` across enrichment and content endpoints accepts:

```yaml
handle:
  type: string
  description: 'Creator identifier: username, profile URL, or YouTube channel ID (UC...).'
```

Never leave identifier fields with just `type: string` and no description. The API consumer needs to know whether to pass a username, a URL, an ID, etc.
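
The identifier rule above lends itself to a mechanical check. The following is a minimal sketch, not part of the repository: the function name `find_undescribed_identifiers` and the `IDENTIFIER_FIELDS` set are assumptions made for illustration. It takes an already-parsed spec as a plain dict (loading `openapi.yaml` with a YAML parser is left to the caller) and reports identifier fields under `components.schemas` that have no `description`.

```python
# Hypothetical lint helper: the names and the identifier list are illustrative assumptions.
IDENTIFIER_FIELDS = {"handle", "creators", "email", "post_id"}


def find_undescribed_identifiers(spec, identifier_fields=IDENTIFIER_FIELDS):
    """Return sorted (schema_name, field_name) pairs for identifier fields without a description."""
    missing = []
    schemas = spec.get("components", {}).get("schemas", {})
    for schema_name, schema in schemas.items():
        for field_name, prop in schema.get("properties", {}).items():
            if field_name not in identifier_fields:
                continue
            # A $ref property is documented on the referenced schema, so skip it here.
            if "$ref" in prop:
                continue
            if not str(prop.get("description", "")).strip():
                missing.append((schema_name, field_name))
    return sorted(missing)


# Tiny in-memory spec standing in for a parsed openapi.yaml (schema names are made up).
example_spec = {
    "components": {
        "schemas": {
            "GoodInput": {
                "type": "object",
                "properties": {
                    "handle": {
                        "type": "string",
                        "description": "Creator identifier: username or profile URL.",
                    }
                },
            },
            "BadInput": {
                "type": "object",
                "properties": {
                    "handle": {"type": "string"},
                    "limit": {"type": "integer"},
                },
            },
        }
    }
}

print(find_undescribed_identifiers(example_spec))  # [('BadInput', 'handle')]
```

A check like this only confirms that a description exists. Whether the text correctly states the accepted values still has to be verified against the backend view.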
### Response wrapper pattern

Most endpoints wrap their response in a standard envelope:

```python
final_resp = {
    "result": ...,         # the actual data
    "credits_cost": ...,   # Decimal
    "response_meta": ...,  # dict with scraper_used, filter_key
}
```

The response schema should include `credits_cost` (type: number, nullable: true) and `result` (the `$ref` to the actual data schema). `response_meta` is internal and does not need to be documented.

### Result schemas: use named, expanded models

Every endpoint should have a dedicated **named Result schema** (e.g. `CreatorPostsResult`, `AudienceOverlapResponse`) rather than inlining the response structure. Break nested objects out into their own schemas so that Redocly renders them as expandable, clearly documented sections.

**Good**: each nested object is its own schema:

```yaml
CreatorPostsResult:
  type: object
  properties:
    posts:
      type: array
      items:
        $ref: '#/components/schemas/PostItemModel'
    next_cursor:
      type: string
      nullable: true

PostItemModel:
  type: object
  properties:
    id:
      type: string
    text:
      type: string
      nullable: true
    engagement:
      $ref: '#/components/schemas/EngagementModel'
    user:
      $ref: '#/components/schemas/UserInfoModel'

EngagementModel:
  type: object
  properties:
    likes:
      type: integer
      nullable: true
    comments:
      type: integer
      nullable: true
```

**Bad**: everything inlined in one flat schema:

```yaml
CreatorPostsResult:
  type: object
  properties:
    posts:
      type: array
      items:
        type: object
        properties:
          id:
            type: string
          likes:
            type: integer
          comments:
            type: integer
```

Named sub-schemas make the docs easier to read and keep the response panel expandable. Follow the backend serializer structure: if the backend has a nested serializer, create a corresponding named schema.

### Parameter descriptions and examples (future iteration)

> **Note**: This is planned for a future iteration of `openapi.yaml`. When updating or adding endpoints, start applying this pattern where practical.


Every request parameter should include:

- A `description` explaining what the field does and what values are accepted.
- An `example` value showing a realistic input.

```yaml
handle:
  type: string
  description: 'Creator identifier: username, profile URL, or YouTube channel ID (UC...).'
  example: 'cristiano'
platform:
  type: string
  enum:
  - instagram
  - tiktok
  - youtube
  description: 'Social platform to query.'
  example: 'instagram'
```

This applies to all fields, not just identifiers. The goal is that a developer reading the docs can understand every field without needing to look at external resources.

### Example: serializer to schema

Backend serializer:

```python
from rest_framework import serializers


class AudienceOverlapInputSerializer(serializers.Serializer):
    # (value, label) pairs; only the values appear in the OpenAPI enum
    overlap_platforms = [
        ("instagram", "instagram"),
        ("tiktok", "tiktok"),
        ("youtube", "youtube"),
    ]
    platform = serializers.ChoiceField(choices=overlap_platforms)
    creators = serializers.ListField(
        child=serializers.CharField(),
        min_length=2,   # maps to minItems
        max_length=10,  # maps to maxItems
    )
```

OpenAPI schema:

```yaml
AudienceOverlapInput:
  type: object
  properties:
    platform:
      type: string
      enum:
      - instagram
      - tiktok
      - youtube
    creators:
      type: array
      items:
        type: string
      minItems: 2
      maxItems: 10
  required:
  - platform
  - creators
```

### Example: nested serializer

Backend:

```python
class OverlapDataItemSerializer(serializers.Serializer):
    user = OverlapUserSerializer()  # nested serializer, defined elsewhere in the backend
    user_id = serializers.CharField()
    username = serializers.CharField()
    followers = serializers.IntegerField()
    unique_percentage = serializers.FloatField()
    overlapping_percentage = serializers.FloatField()
```

OpenAPI:

```yaml
OverlapDataItem:
  type: object
  properties:
    user:
      $ref: '#/components/schemas/OverlapUser'
    user_id:
      type: string
    username:
      type: string
    followers:
      type: integer
    unique_percentage:
      type: number
    overlapping_percentage:
      type: number
  required:
  - user
  - user_id
  - username
  - followers
  - unique_percentage
  - overlapping_percentage
```

## tags section

Located near the end of the file, before `x-tagGroups`.
Each endpoint's tag must be listed here:

```yaml
tags:
- name: Discovery API
- name: Similar Creators
- name: Audience Overlap
- name: Dictionary
  description: |
    Provides the complete set of supported filter values...
- name: Connected Socials
- name: Enrich by handle full
...
```

Add `description:` only when the tag name alone isn't self-explanatory (e.g. Dictionary, Batch enrichment).

## x-tagGroups (sidebar navigation)

Located at the very bottom of the file. **Every tag must be listed here or it will not appear in the sidebar.**

```yaml
x-tagGroups:
- name: Creator Discovery
  tags:
  - Discovery API
  - Similar Creators
  - Audience Overlap
  - Dictionary
- name: Creator Enrichment
  tags:
  - Connected Socials
  - Enrich by handle full
  - Enrich by handle raw
  - Enrich by email advanced
  - Enrich by email basic
  - Batch enrichment
- name: Creator Content
  tags:
  - Creator Posts
  - Post Details
- name: User Info
  tags:
  - Account credits & usage
```

To add a new endpoint: add its tag under the correct existing group. To create a new group: add a new `- name:` block (but check with the team first; don't invent groups).

## Common mistakes

1. **Endpoint doesn't show in sidebar**: the tag is missing from `x-tagGroups`.
2. **Tag missing from `tags:` section**: even if it's in `x-tagGroups`, the tag must also be declared in the `tags:` list near the end of the file.
3. **Schema field mismatch**: a field name in `openapi.yaml` doesn't match the backend serializer (e.g. `camelCase` vs `snake_case`, or `handle` vs `username`).
4. **Invented constraints**: do not add `enum`, `minItems`, `maxItems`, `description`, or any other constraint that is not in the backend serializer. The backend is the **only** source of truth. The one exception is the `description` this guide requires on identifier fields, which should be written from how the backend view processes the field.
5. **Missing content types**: `requestBody` on a standard POST endpoint must include all three: `application/json`, `application/x-www-form-urlencoded`, `multipart/form-data`. File upload endpoints use only `multipart/form-data`.
6. **Broken YAML string**: an unescaped single quote inside a single-quoted description. Use `''` (doubled) to escape.
7. **Missing trailing slash**: paths must end with `/` to match the Django URL config.
8. **Wrong operationId**: it must follow the `public_v1_{path}_{create|list|retrieve}` convention. Django generates these automatically; copy from the backend's generated schema if available.
9. **Forgetting the AI prompt div**: the AI prompt `<div>` (with its `data-endpoint` attribute) must be present at the end of every endpoint description for the prompt builder card to render.
10. **Nullable without allow_null**: only add `nullable: true` if the backend serializer explicitly has `allow_null=True` (or, for Pydantic models, the field is `Optional[...]`). Don't add it by default.
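
Several of the mistakes above (1, 2, and 7) are structural and can be caught before review. The sketch below is one possible way to do that, under stated assumptions: `lint_spec` is a hypothetical helper that does not exist in the repository, the message wording is invented, and the spec is assumed to be already parsed into a plain dict. The example paths and tags are illustrative only.

```python
# Hypothetical lint sketch: the function name and message wording are illustrative assumptions.
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def lint_spec(spec):
    """Return a list of human-readable problems found in a parsed OpenAPI dict."""
    problems = []
    # Tags declared in the top-level `tags:` list.
    declared = {tag["name"] for tag in spec.get("tags", [])}
    # Tags that appear in any `x-tagGroups` group (needed for sidebar visibility).
    grouped = {
        name
        for group in spec.get("x-tagGroups", [])
        for name in group.get("tags", [])
    }

    for path, methods in spec.get("paths", {}).items():
        if not path.endswith("/"):
            problems.append(f"{path}: missing trailing slash")
        for method, operation in methods.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as `parameters`
            for tag in operation.get("tags", []):
                if tag not in declared:
                    problems.append(f"{method.upper()} {path}: tag '{tag}' not declared in tags")
                if tag not in grouped:
                    problems.append(f"{method.upper()} {path}: tag '{tag}' not listed in x-tagGroups")
    return problems


# In-memory stand-in for a parsed openapi.yaml with two deliberate mistakes.
example_spec = {
    "paths": {
        "/public/v1/creators/socials": {"post": {"tags": ["Connected Socials"]}},
        "/public/v1/creators/posts/": {"post": {"tags": ["Creator Posts"]}},
    },
    "tags": [{"name": "Connected Socials"}, {"name": "Creator Posts"}],
    "x-tagGroups": [{"name": "Creator Enrichment", "tags": ["Connected Socials"]}],
}

for problem in lint_spec(example_spec):
    print(problem)
```

For the example spec this reports the path without a trailing slash and the `Creator Posts` tag that is declared but not grouped. Mistakes that depend on the backend (field names, invented constraints, nullability) cannot be detected this way and still need a manual comparison with the serializer.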