# OpenAPI YAML Guide

How to add or update endpoint definitions in `openapi.yaml`.
The backend **stage branch** serializers are the source of truth for schemas and field names.

## File structure


```
openapi.yaml
├── info:
│   └── description         — top-level API overview (markdown in single-quoted YAML)
├── paths:                   — all endpoint definitions, ordered by sidebar position
│   └── /public/v1/.../
│       └── post: / get:     — method block
├── components:
│   └── schemas:             — all request/response models
├── tags:                    — tag definitions (name + optional description)
└── x-tagGroups:             — sidebar navigation groups (CRITICAL for visibility)
```
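
As a rough sanity check on the layout above, the following sketch reports which of those top-level sections are missing. It assumes `openapi.yaml` has already been parsed into a plain dict by whatever YAML loader the project uses; `missing_sections` and `REQUIRED_SECTIONS` are illustrative names, not part of any existing tooling.

```python
# Sketch: report which top-level sections from the tree above are missing.
# `spec` is assumed to be openapi.yaml already parsed into a dict.
REQUIRED_SECTIONS = [
    ("info", "description"),
    ("paths",),
    ("components", "schemas"),
    ("tags",),
    ("x-tagGroups",),
]


def missing_sections(spec: dict) -> list[str]:
    """Return dotted names of required sections that are absent."""
    missing = []
    for keys in REQUIRED_SECTIONS:
        node = spec
        for key in keys:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(keys))
                break
            node = node[key]
    return missing


example_spec = {
    "info": {"description": "API overview"},
    "paths": {},
    "components": {"schemas": {}},
    "tags": [],
}
print(missing_sections(example_spec))  # ['x-tagGroups']
```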

## Adding a new endpoint

### 1. Path entry


```yaml
  /public/v1/creators/your/route/:
    post:
      operationId: public_v1_creators_your_route_create
      description: 'Single-line summary of what the endpoint does.


        **What you get**

        - First bullet describing the primary return value.

        - Second bullet with additional detail.


        **Credits**

        - X credits per successful request. If no data is returned, no credits
        are deducted.


        <div class="ic-ai-prompt-root" data-endpoint="your-endpoint-key"></div>'
      summary: Your Endpoint Name
      tags:
      - Your Endpoint Name
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
          application/x-www-form-urlencoded:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/YourInputSchema'
        required: true
      security:
      - Bearer: []
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/YourResponseSchema'
          description: ''
      x-credits: X credits per successful request.
```

### Key rules

- **Trailing slash** on all paths: `/public/v1/creators/socials/` not `/public/v1/creators/socials`.
- **`description`** is a single-quoted YAML string. Escape literal single quotes by doubling them: `can''t` renders as `can't`.
- **Indentation** inside the description: continuation lines are indented 8 spaces, two more than the `description:` key.
- **Required sections**: **What you get** and **Credits**. These two must always be present. Additional sections like **When to use**, **Result format**, or **How it works** are optional — add them only when the endpoint needs extra explanation. Each section heading uses `**bold**` markdown.
- **`summary`** must match the tag name exactly — it becomes the sidebar label.
- **Multiple endpoints per tag**: Several endpoints can share one tag (e.g. all batch endpoints use `Batch enrichment`). They appear as separate operations on the same page.
- **`operationId`** convention: `public_v1_{path_segments_with_underscores}_{create|list|retrieve}` — `create` for POST, `list` for GET (collection), `retrieve` for GET (single resource).
- **Three content types** in `requestBody` for standard POST endpoints: `application/json`, `application/x-www-form-urlencoded`, `multipart/form-data` — all point to the same schema `$ref`. **File upload endpoints** (like batch create) use only `multipart/form-data`.
- **`security`**: always `- Bearer: []`.
- **`x-credits`** is a plain-text string placed after `responses`.
- The `<div class="ic-ai-prompt-root" data-endpoint="..."></div>` placeholder must be the **last thing** before the closing `'`. The `data-endpoint` value is a kebab-case key that matches the `ENDPOINTS` array in `copy-for-ai.js`.

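Several of the rules above are mechanical enough to check with a script. The sketch below is one possible shape for such a check, assuming the method block has already been parsed into a dict. The function names, the `file_upload` flag, and the message wording are all made up for illustration. The `operationId` helper only handles paths without path parameters.

```python
# Sketch of a linter for the key rules; `op` is one parsed method block.
CONTENT_TYPES = {
    "application/json",
    "application/x-www-form-urlencoded",
    "multipart/form-data",
}
PROMPT_DIV_PREFIX = '<div class="ic-ai-prompt-root" data-endpoint="'
SUFFIXES = {"create", "list", "retrieve"}


def expected_operation_id(path: str, suffix: str) -> str:
    """Build public_v1_..._{suffix} from a path without path parameters."""
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix}")
    segments = [s for s in path.split("/") if s]
    return "_".join(segments + [suffix])


def lint_operation(path: str, op: dict, file_upload: bool = False) -> list[str]:
    """Return a list of rule violations for one operation."""
    problems = []
    if not path.endswith("/"):
        problems.append("path is missing the trailing slash")
    if op.get("security") != [{"Bearer": []}]:
        problems.append("security must be exactly '- Bearer: []'")
    description = op.get("description", "").rstrip()
    for heading in ("**What you get**", "**Credits**"):
        if heading not in description:
            problems.append(f"description lacks the {heading} section")
    # The placeholder must be present and its closing markup must end the text.
    has_div = PROMPT_DIV_PREFIX in description
    if not has_div or not description.endswith('"></div>'):
        problems.append("description must end with the AI prompt div")
    if "requestBody" in op:
        content = op["requestBody"].get("content", {})
        wanted = {"multipart/form-data"} if file_upload else CONTENT_TYPES
        if set(content) != wanted:
            problems.append("requestBody content types do not match the rule")
        refs = {body.get("schema", {}).get("$ref") for body in content.values()}
        if len(refs) > 1:
            problems.append("content types point at different schemas")
    return problems


op = {
    "operationId": "public_v1_creators_your_route_create",
    "description": (
        "Summary.\n\n**What you get**\n- Data.\n\n**Credits**\n- X credits.\n\n"
        '<div class="ic-ai-prompt-root" data-endpoint="your-endpoint-key"></div>'
    ),
    "security": [{"Bearer": []}],
    "requestBody": {
        "content": {
            t: {"schema": {"$ref": "#/components/schemas/YourInputSchema"}}
            for t in CONTENT_TYPES
        }
    },
}
print(expected_operation_id("/public/v1/creators/your/route/", "create"))
print(lint_operation("/public/v1/creators/your/route", op))
# ['path is missing the trailing slash']
```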

### GET endpoints

For GET endpoints, use `parameters:` instead of `requestBody:`. Query parameters use `in: query`; path parameters use `in: path` and must set `required: true`:


```yaml
      parameters:
      - in: path
        name: batch_id
        schema:
          type: string
        required: true
        description: The batch job identifier
      - in: query
        name: format
        schema:
          type: string
          enum: [csv, json]
          default: csv
        required: false
```
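
A mismatch between the path and its declared path parameters is easy to miss by eye. The sketch below compares the two, assuming paths use OpenAPI's `{name}` templating. The route in the example is a hypothetical placeholder, and `path_parameter_problems` is an illustrative name.

```python
import re


def path_parameter_problems(path: str, parameters: list[dict]) -> list[str]:
    """Compare {placeholders} in the path with the declared `in: path` parameters."""
    in_path = set(re.findall(r"\{([^}]+)\}", path))
    declared = {p["name"]: p for p in parameters if p.get("in") == "path"}
    problems = []
    for name in sorted(in_path - set(declared)):
        problems.append(f"{name}: in the path but not declared")
    for name in sorted(set(declared) - in_path):
        problems.append(f"{name}: declared but not in the path")
    for name in sorted(in_path & set(declared)):
        if declared[name].get("required") is not True:
            problems.append(f"{name}: path parameters need required: true")
    return problems


# Hypothetical route, used only to exercise the check.
params = [
    {"in": "path", "name": "batch_id", "schema": {"type": "string"}, "required": True},
    {"in": "query", "name": "format", "schema": {"type": "string"}, "required": False},
]
print(path_parameter_problems("/public/v1/batch/{batch_id}/", params))  # []
print(path_parameter_problems("/public/v1/batch/{job_id}/", params))
```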

### Error responses

Most endpoints only define `'200'`. For endpoints with documented error cases, add error responses:


```yaml
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/YourResponseSchema'
          description: ''
        '400':
          content:
            application/json:
              schema:
                type: object
                additionalProperties: {}
          description: Bad Request - Invalid input
        '403':
          content:
            application/json:
              schema:
                type: object
                additionalProperties: {}
          description: Forbidden - Insufficient permissions
        '429':
          content:
            application/json:
              schema:
                type: object
                additionalProperties: {}
          description: Too Many Requests - Rate limit exceeded
```
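
The three error entries differ only in their status code and description. If the file is ever generated or checked from a script, that repetition can be factored out. This is only a sketch: `error_response` and `ERROR_RESPONSES` are illustrative names, and the dicts mirror the YAML above once parsed.

```python
def error_response(description: str) -> dict:
    """Generic error body: a free-form JSON object."""
    return {
        "content": {
            "application/json": {
                "schema": {"type": "object", "additionalProperties": {}}
            }
        },
        "description": description,
    }


ERROR_RESPONSES = {
    "400": error_response("Bad Request - Invalid input"),
    "403": error_response("Forbidden - Insufficient permissions"),
    "429": error_response("Too Many Requests - Rate limit exceeded"),
}
print(sorted(ERROR_RESPONSES))  # ['400', '403', '429']
```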

## Schema conventions

### Match the backend serializer exactly

| Backend (Django REST Framework) | OpenAPI 3.0.3 |
|  --- | --- |
| `serializers.CharField()` | `type: string` |
| `serializers.CharField(allow_blank=True)` | `type: string` (no special mapping) |
| `serializers.IntegerField()` | `type: integer` |
| `serializers.FloatField()` | `type: number` |
| `serializers.BooleanField()` | `type: boolean` |
| `serializers.DictField()` | `type: object` |
| `serializers.ListField(child=CharField())` | `type: array` + `items: { type: string }` |
| `serializers.ChoiceField(choices=[...])` | `type: string` + `enum: [...]` |
| `allow_null=True` | Add `nullable: true` to the property |
| `required=False` | Do NOT include in the `required:` array |
| `required=True` (or field has no `required=False`) | Include in the `required:` array |
| `default=value` | Add `default: value` to the property |
| `min_length=N` on a `ListField` | `minItems: N` |
| `max_length=N` on a `ListField` | `maxItems: N` |
| `help_text="..."` | Add `description: '...'` to the property |
| Nested `SomeSerializer()` | `$ref: '#/components/schemas/Some'` |
| Nested `SomeSerializer(many=True)` | `type: array` + `items: { $ref: '#/components/schemas/Some' }` |
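| Nested `SomeSerializer(required=False, allow_null=True)` | `allOf: [{ $ref: '#/components/schemas/Some' }]` + `nullable: true` (OpenAPI 3.0.3 ignores keywords placed beside a bare `$ref`); leave it out of `required:` |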


### Pydantic models

Some backend code uses Pydantic `BaseModel` instead of DRF serializers (e.g. `SocialProfileFields`). The mapping is similar:

| Pydantic | OpenAPI |
|  --- | --- |
| `Field(default=None)` | `nullable: true`, not in `required` |
| `Optional[str]` | `type: string`, `nullable: true` |
| `Optional[int]` | `type: integer`, `nullable: true` |
| `Optional[bool]` | `type: boolean`, `nullable: true` |
| `Optional[list[str]]` | `type: array` + `items: { type: string }`, `nullable: true` |
| `validation_alias=AliasChoices(...)` | Use the **first** alias as the field name (that's the canonical name) |

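The type rows of this table can be expressed as a small recursive translation over type annotations. The sketch below uses only the standard `typing` helpers and does not import Pydantic. It handles `Optional[...]`, `list[...]`, and the four primitives, and it ignores aliases and defaults. `annotation_to_schema` is an illustrative name.

```python
from typing import Optional, Union, get_args, get_origin

PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def annotation_to_schema(annotation) -> dict:
    """Translate a type annotation into an OpenAPI 3.0 property dict."""
    origin = get_origin(annotation)
    if origin is Union:
        args = get_args(annotation)
        inner = [a for a in args if a is not type(None)]
        # Only Optional[X] (exactly one non-None member plus None) is handled.
        if len(inner) != 1 or len(inner) == len(args):
            raise TypeError(f"only Optional[X] is handled, got {annotation!r}")
        return {**annotation_to_schema(inner[0]), "nullable": True}
    if origin is list:
        (item,) = get_args(annotation)
        return {"type": "array", "items": annotation_to_schema(item)}
    if annotation in PRIMITIVES:
        return {"type": PRIMITIVES[annotation]}
    raise TypeError(f"unsupported annotation: {annotation!r}")


print(annotation_to_schema(Optional[str]))
# {'type': 'string', 'nullable': True}
print(annotation_to_schema(Optional[list[str]]))
# {'type': 'array', 'items': {'type': 'string'}, 'nullable': True}
```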

### Field names

Use the **exact** field names from the backend serializer. If the serializer uses `snake_case`, the schema uses `snake_case`. If it uses `camelCase` (e.g. `userId`), the schema uses `camelCase`. **Do not rename fields.**

### Identifier field descriptions

Any field that acts as an identifier (`handle`, `creators`, `email`, `post_id`, etc.) **must** have a `description` explaining exactly what values are accepted. Check the backend view to see how the field is processed.

Example: across the enrichment and content endpoints, `handle` is documented as:


```yaml
        handle:
          type: string
          description: 'Creator identifier — username, profile URL, or YouTube channel ID (UC...).'
```

Never leave identifier fields with just `type: string` and no description. The API consumer needs to know whether to pass a username, a URL, an ID, etc.

### Response wrapper pattern

Most endpoints wrap their response in a standard envelope:


```python
final_resp = {
    "result": ...,          # the actual data
    "credits_cost": ...,    # Decimal
    "response_meta": ...,   # dict with scraper_used, filter_key
}
```

The response schema should include `credits_cost` (`type: number`, `nullable: true`) and `result` (a `$ref` to the actual data schema). `response_meta` is internal and does not need to be documented.

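As an illustration of the documented part of that envelope, the sketch below builds the wrapper schema for a given result schema name. `envelope_schema` is an illustrative helper, and the returned dict is what the YAML would parse to.

```python
def envelope_schema(result_schema_name: str) -> dict:
    """Build the documented part of the response envelope for one endpoint."""
    return {
        "type": "object",
        "properties": {
            "result": {"$ref": f"#/components/schemas/{result_schema_name}"},
            "credits_cost": {"type": "number", "nullable": True},
            # response_meta is internal, so it is deliberately left out.
        },
    }


print(envelope_schema("CreatorPostsResult"))
```
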
### Result schemas — use named, expanded models

Every endpoint should have a dedicated **named Result schema** (e.g. `CreatorPostsResult`, `AudienceOverlapResponse`) rather than inlining the response structure. Break nested objects out into their own schemas so that Redocly renders them as expandable, clearly documented sections.

**Good** — each nested object is its own schema:


```yaml
    CreatorPostsResult:
      type: object
      properties:
        posts:
          type: array
          items:
            $ref: '#/components/schemas/PostItemModel'
        next_cursor:
          type: string
          nullable: true

    PostItemModel:
      type: object
      properties:
        id:
          type: string
        text:
          type: string
          nullable: true
        engagement:
          $ref: '#/components/schemas/EngagementModel'
        user:
          $ref: '#/components/schemas/UserInfoModel'

    EngagementModel:
      type: object
      properties:
        likes:
          type: integer
          nullable: true
        comments:
          type: integer
          nullable: true
```

**Bad** — nested objects inlined in a single schema:


```yaml
    CreatorPostsResult:
      type: object
      properties:
        posts:
          type: array
          items:
            type: object
            properties:
              id:
                type: string
              likes:
                type: integer
              comments:
                type: integer
```

Named sub-schemas make the docs easier to read and keep the response panel expandable. Follow the backend serializer structure — if the backend has a nested serializer, create a corresponding named schema.

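One way to spot the inlined pattern is to walk a parsed schema and list every nested object that carries its own `properties` instead of a `$ref`. This is a sketch under that assumption. `inline_objects` is an illustrative name, and the two dicts restate the Good and Bad examples above in abbreviated form.

```python
def inline_objects(schema: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of nested objects that are defined inline."""
    found = []
    for name, prop in schema.get("properties", {}).items():
        label = f"{prefix}.{name}" if prefix else name
        node = prop
        # For arrays, inspect the item definition rather than the array itself.
        if prop.get("type") == "array" and "items" in prop:
            node, label = prop["items"], label + "[]"
        if "$ref" in node:
            continue  # already points at a named schema
        if node.get("type") == "object" and "properties" in node:
            found.append(label)
            found.extend(inline_objects(node, label))
    return found


bad = {
    "type": "object",
    "properties": {
        "posts": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "id": {"type": "string"},
                    "likes": {"type": "integer"},
                    "comments": {"type": "integer"},
                },
            },
        }
    },
}
good = {
    "type": "object",
    "properties": {
        "posts": {"type": "array", "items": {"$ref": "#/components/schemas/PostItemModel"}},
        "next_cursor": {"type": "string", "nullable": True},
    },
}
print(inline_objects(bad))   # ['posts[]']
print(inline_objects(good))  # []
```
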
### Parameter descriptions and examples (future iteration)

> **Note**: This is planned for a future iteration of `openapi.yaml`. When updating or adding endpoints, start applying this pattern where practical.


Every request parameter should include:

- A `description` explaining what the field does and what values are accepted.
- An `example` value showing a realistic input.


```yaml
        handle:
          type: string
          description: 'Creator identifier — username, profile URL, or YouTube channel ID (UC...).'
          example: 'cristiano'
        platform:
          type: string
          enum:
          - instagram
          - tiktok
          - youtube
          description: 'Social platform to query.'
          example: 'instagram'
```

This applies to all fields, not just identifiers. The goal is that a developer reading the docs can understand every field without needing to look at external resources.

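Progress toward this goal can be tracked with a small report of which properties still lack a `description` or an `example`. The sketch below assumes a parsed schema dict and skips `$ref` properties, on the assumption that those are documented on the referenced schema. `undocumented_fields` is an illustrative name.

```python
def undocumented_fields(schema: dict) -> dict[str, list[str]]:
    """Map each property name to the documentation keys it lacks."""
    report = {}
    for name, prop in schema.get("properties", {}).items():
        if "$ref" in prop:
            continue  # assumed to be documented on the referenced schema
        missing = [key for key in ("description", "example") if key not in prop]
        if missing:
            report[name] = missing
    return report


request_schema = {
    "type": "object",
    "properties": {
        "handle": {
            "type": "string",
            "description": "Creator identifier.",
            "example": "cristiano",
        },
        "platform": {"type": "string", "enum": ["instagram", "tiktok", "youtube"]},
    },
}
print(undocumented_fields(request_schema))
# {'platform': ['description', 'example']}
```
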
### Example: serializer to schema

Backend serializer:


```python
class AudienceOverlapInputSerializer(serializers.Serializer):
    overlap_platforms = [
        ("instagram", "instagram"),
        ("tiktok", "tiktok"),
        ("youtube", "youtube"),
    ]
    platform = serializers.ChoiceField(choices=overlap_platforms)
    creators = serializers.ListField(
        child=serializers.CharField(),
        min_length=2,
        max_length=10,
    )
```

OpenAPI schema:


```yaml
    AudienceOverlapInput:
      type: object
      properties:
        platform:
          type: string
          enum:
          - instagram
          - tiktok
          - youtube
        creators:
          type: array
          items:
            type: string
          minItems: 2
          maxItems: 10
      required:
      - platform
      - creators
```
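
The translation above can also be sketched mechanically. The block below does not import Django REST Framework. `FieldSpec` is a hypothetical stand-in that records only the field options used in the mapping table, with `choices` given as plain values rather than DRF's `(value, label)` pairs. It covers a subset of the table and leaves out `default` and nested serializers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """Hypothetical stand-in for a DRF field; not a DRF class."""
    kind: str  # "char", "integer", "float", "boolean", "dict", "choice", "list"
    required: bool = True
    allow_null: bool = False
    choices: Optional[list] = None
    child: Optional["FieldSpec"] = None
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    help_text: Optional[str] = None


SIMPLE_TYPES = {"char": "string", "integer": "integer", "float": "number",
                "boolean": "boolean", "dict": "object"}


def to_property(field: FieldSpec) -> dict:
    """Translate one field into an OpenAPI property, following the table rows."""
    if field.kind == "choice":
        prop = {"type": "string", "enum": list(field.choices)}
    elif field.kind == "list":
        prop = {"type": "array", "items": to_property(field.child)}
        if field.min_length is not None:
            prop["minItems"] = field.min_length
        if field.max_length is not None:
            prop["maxItems"] = field.max_length
    else:
        prop = {"type": SIMPLE_TYPES[field.kind]}
    if field.allow_null:
        prop["nullable"] = True
    if field.help_text:
        prop["description"] = field.help_text
    return prop


def to_schema(fields: dict[str, FieldSpec]) -> dict:
    """Build an object schema; fields are required unless required=False."""
    schema = {"type": "object",
              "properties": {name: to_property(f) for name, f in fields.items()}}
    required = [name for name, f in fields.items() if f.required]
    if required:
        schema["required"] = required
    return schema


audience_overlap_input = to_schema({
    "platform": FieldSpec("choice", choices=["instagram", "tiktok", "youtube"]),
    "creators": FieldSpec("list", child=FieldSpec("char"), min_length=2, max_length=10),
})
print(audience_overlap_input["required"])  # ['platform', 'creators']
```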

### Example: nested serializer

Backend:


```python
class OverlapDataItemSerializer(serializers.Serializer):
    user = OverlapUserSerializer()
    user_id = serializers.CharField()
    username = serializers.CharField()
    followers = serializers.IntegerField()
    unique_percentage = serializers.FloatField()
    overlapping_percentage = serializers.FloatField()
```

OpenAPI:


```yaml
    OverlapDataItem:
      type: object
      properties:
        user:
          $ref: '#/components/schemas/OverlapUser'
        user_id:
          type: string
        username:
          type: string
        followers:
          type: integer
        unique_percentage:
          type: number
        overlapping_percentage:
          type: number
      required:
      - user
      - user_id
      - username
      - followers
      - unique_percentage
      - overlapping_percentage
```

## tags section

Located near the end of the file, before `x-tagGroups`. Each endpoint's tag must be listed here:


```yaml
tags:
- name: Discovery API
- name: Similar Creators
- name: Audience Overlap
- name: Dictionary
  description: |
    Provides the complete set of supported filter values...
- name: Connected Socials
- name: Enrich by handle full
...
```

Add `description:` only when the tag name alone isn't self-explanatory (e.g. Dictionary, Batch enrichment).

## x-tagGroups (sidebar navigation)

Located at the very bottom of the file. **Every tag must be listed here or it will not appear in the sidebar.**


```yaml
x-tagGroups:
- name: Creator Discovery
  tags:
  - Discovery API
  - Similar Creators
  - Audience Overlap
  - Dictionary
- name: Creator Enrichment
  tags:
  - Connected Socials
  - Enrich by handle full
  - Enrich by handle raw
  - Enrich by email advanced
  - Enrich by email basic
  - Batch enrichment
- name: Creator Content
  tags:
  - Creator Posts
  - Post Details
- name: User Info
  tags:
  - Account credits & usage
```
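
Because a tag has to appear in both `tags:` and `x-tagGroups` to show up, a cross-check over the parsed file can catch omissions early. The sketch below assumes the spec is already loaded as a dict. `tag_problems` is an illustrative name, and the two paths in the example are placeholders rather than real routes.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def tag_problems(spec: dict) -> list[str]:
    """List tags used by operations that are undeclared or ungrouped."""
    declared = {tag["name"] for tag in spec.get("tags", [])}
    grouped = {
        name
        for group in spec.get("x-tagGroups", [])
        for name in group.get("tags", [])
    }
    used = set()
    for path_item in spec.get("paths", {}).values():
        for method, op in path_item.items():
            if method in HTTP_METHODS:
                used.update(op.get("tags", []))
    problems = []
    for tag in sorted(used):
        if tag not in declared:
            problems.append(f"{tag}: missing from tags")
        if tag not in grouped:
            problems.append(f"{tag}: missing from x-tagGroups")
    return problems


spec = {
    "paths": {
        # Placeholder paths, not real routes.
        "/public/v1/example-a/": {"post": {"tags": ["Creator Posts"]}},
        "/public/v1/example-b/": {"post": {"tags": ["Post Details"]}},
    },
    "tags": [{"name": "Creator Posts"}, {"name": "Post Details"}],
    "x-tagGroups": [{"name": "Creator Content", "tags": ["Creator Posts"]}],
}
print(tag_problems(spec))  # ['Post Details: missing from x-tagGroups']
```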

To add a new endpoint: add its tag under the correct existing group. To create a new group: add a new `- name:` block (but check with the team first — don't invent groups).

## Common mistakes

1. **Endpoint doesn't show in sidebar** — tag is missing from `x-tagGroups`.
2. **Tag missing from `tags:` section** — even if it's in `x-tagGroups`, the tag must also be declared in the `tags:` list near the end of the file.
3. **Schema field mismatch** — field name in openapi.yaml doesn't match backend serializer (e.g. `camelCase` vs `snake_case`, or `handle` vs `username`).
4. **Invented constraints** — do not add `enum`, `minItems`, `maxItems`, or any other constraint that is not in the backend serializer, and do not write a `description` that claims behavior the backend does not implement. The backend is the **only** source of truth.
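5. **Missing content types** — `requestBody` on a standard POST endpoint must include all three: `application/json`, `application/x-www-form-urlencoded`, `multipart/form-data`. File upload endpoints are the exception and use only `multipart/form-data`.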
6. **Broken YAML string** — unescaped single quote inside a single-quoted description. Use `''` (doubled) to escape.
7. **Missing trailing slash** — paths must end with `/` to match the Django URL config.
8. **Wrong operationId** — must follow the `public_v1_{path_segments_with_underscores}_{create|list|retrieve}` convention. The backend generates these automatically; copy from the backend's generated schema if available.
9. **Forgetting the AI prompt div** — the `<div class="ic-ai-prompt-root" data-endpoint="..."></div>` must be present at the end of every endpoint description for the prompt builder card to render.
10. **Nullable without allow_null** — only add `nullable: true` if the backend serializer explicitly has `allow_null=True` (or, for Pydantic models, the field is `Optional[...]` or defaults to `None`). Don't add it by default.