0023. Deterministic Alembic Schema And Identifier Naming
Status
Completed
Context
ADR 0022 moved ms-markets to the SDK-managed Alembic migration workflow. The
first generated revision files exposed a bad autogenerate shape:
op.drop_constraint(
op.f("ms_markets__assetcurrentpricingdetails__mainsequ_asset_uid_fkey"),
"ms_markets__assetcurrentpricingdetails__mainsequence_examples",
type_="foreignkey",
)
op.create_foreign_key(
None,
"ms_markets__assetcurrentpricingdetails__mainsequence_examples",
"ms_markets__asset__mainsequence_examples",
["asset_uid"],
["uid"],
source_schema="public",
referent_schema="public",
ondelete="CASCADE",
)
The database already has the reflected foreign-key name. The churn is not caused by a missing backend constraint. It is caused by Alembic comparing schema-qualified SQLAlchemy metadata against reflected default-schema database objects and by unstable Python-side constraint names.
Alembic foreign-key comparison includes source schema, source table, source
columns, target schema, target table, target columns, and FK options.
PostgreSQL's public schema is the default schema. In autogenerate,
default-schema reflected objects are represented as schema=None. Therefore
authoring package metadata with explicit schema="public" makes Alembic see a
false difference:
reflected FK: schema=None
metadata FK: schema="public"
That false None versus public difference is enough for Alembic to emit FK
drop/create churn even when the FK names and options are identical.
The current metadata also lacks deterministic FK naming:
class MarketsBase(DeclarativeBase):
metadata = MetaData()
Model declarations use natural SQLAlchemy ForeignKey(...) declarations
without name=..., so model-side FK names are None. Once Alembic decides a
foreign key differs, it renders op.create_foreign_key(None, ...) and can
produce invalid downgrade operations such as
op.drop_constraint(None, ..., type_="foreignkey").
Indexes have a separate naming problem. Many models declare Index(None, ...).
SQLAlchemy's default index naming is based on the first indexed column. That
creates duplicate generated names when a table has both a single-column index
and a composite index that starts with the same column. The namespace suffix can
also push generated index names over PostgreSQL's identifier limit. PostgreSQL
visible identifiers are limited to POSTGRES_IDENTIFIER_MAX_LENGTH characters
in the SDK, currently 63. Names longer than that are reflected back as truncated
database names, which causes more Alembic drop/create noise.
Nothing is deployed yet for this migration stream, so bad generated revisions can be discarded after the comparison contract is fixed.
Decision
ms-markets will make Alembic metadata deterministic before accepting any
source revision files.
The fix has two parts:
- Make Alembic schema reflection match the authored metadata.
- Centralize all physical table, foreign-key, primary-key, unique-constraint, check-constraint, and index naming in one package-owned naming module.
Default Schema Normalization
ms-markets must treat PostgreSQL public as the default schema in authored
SQLAlchemy metadata. Default-schema package tables and the package Alembic
version table are authored as schema=None, not schema="public".
This means:
MARKETS_SCHEMAisNone;MARKETS_DEFAULT_SCHEMArecords the database default namepublic;markets_table_args(...)omits the SQLAlchemyschematable option when the requested schema isNone, empty, orpublic;MarketsAlembicVersion.__alembic_version_schema__isNone;- the SDK
PlatformManagedMetaTable.__table_cls__default-schema behavior is overridden in thems-marketsmixins so it cannot forcepublicinto the SQLAlchemyTablemetadata; - explicit named schemas remain supported, but only non-default schemas are authored as schema-qualified metadata.
src/migrations/env.py must not force schema-aware comparison for default
schema metadata. It should enable include_schemas only when the provider
metadata or version table actually uses a non-default named schema.
The Alembic environment shape is:
from msm.base import MARKETS_DEFAULT_SCHEMA, MARKETS_SCHEMA
def _included_schema(name: str | None) -> bool:
if MARKETS_SCHEMA is None:
return name in (None, MARKETS_DEFAULT_SCHEMA)
return name == MARKETS_SCHEMA
def _uses_named_schemas() -> bool:
migration = _migration_provider()
if migration.version_table_schema is not None:
return True
return any(table.schema is not None for table in migration.target_metadata.tables.values())
def include_name(name, type_, parent_names):
if type_ == "schema":
return _included_schema(name)
schema_name = parent_names.get("schema_name") if parent_names else None
if not _included_schema(schema_name):
return False
return _migration_provider().include_name(name, type_, parent_names)
def include_object(object_, name, type_, reflected, compare_to):
object_schema = getattr(object_, "schema", None)
if not _included_schema(object_schema):
return False
return _migration_provider().include_object(
object_,
name,
type_,
reflected,
compare_to,
)
def _configure_kwargs():
migration = _migration_provider()
return {
"target_metadata": migration.target_metadata,
"version_table": migration.version_table,
"version_table_schema": migration.version_table_schema,
"include_schemas": _uses_named_schemas(),
"include_name": include_name,
"include_object": include_object,
"compare_type": True,
"compare_server_default": True,
}
The important behavior is that default-schema metadata and default-schema
reflection both compare as None. The provider still owns table scope.
SDK Naming Helpers
Deterministic physical-name helpers are SDK-owned and imported from
mainsequence.meta_tables. ms-markets must not carry a duplicate
msm.schema_names module. Package-specific code such as msm.base supplies
the app prefix, namespace suffix, and any compatibility aliases.
The SDK helpers provide this intent:
normalize_identifier_part(value: str, *, field_name: str = "identifier part") -> str
bounded_identifier(*parts: str, max_length: int = POSTGRES_IDENTIFIER_MAX_LENGTH) -> str
schema_table_name(app: str, concept: str, suffix: str | None = None) -> str
parse_schema_table_name(table_name: str) -> SchemaTableNameParts
schema_index_name(table_name: str, columns: Sequence[str], *, unique: bool = False) -> str
schema_foreign_key_name(
table_name: str,
columns: Sequence[str],
target_table: str,
target_columns: Sequence[str] = (),
) -> str
schema_primary_key_name(table_name: str) -> str
schema_unique_constraint_name(table_name: str, columns: Sequence[str]) -> str
schema_check_constraint_name(table_name: str, constraint_name: str | None = None) -> str
sqlalchemy_naming_convention() -> dict[str, Any]
The exact function names can change during implementation, but the behavior must be fixed:
- normalize dots, dashes, whitespace, uppercase letters, and other separators the same way for tables, indexes, and constraints;
- support an
app__concepttable convention where the package supplies the app name, for examplems_markets; - append optional package namespace suffixes only through the table-name helper;
- keep every returned identifier within
POSTGRES_IDENTIFIER_MAX_LENGTH; - include a short deterministic digest when a name must be shortened;
- compute the digest from the full untruncated semantic payload;
- include all indexed or constrained columns in index/constraint identity, not only the first column;
- keep generated names stable across processes and Python versions;
- avoid handwritten per-model FK/index names unless a model has a genuine exceptional requirement.
SQLAlchemy Naming Convention
MarketsBase.metadata must use a SQLAlchemy naming convention that delegates to
the naming module through custom naming tokens. The convention should cover at
least:
pk
fk
ix
uq
ck
Model code should continue to use natural SQLAlchemy declarations:
ForeignKey(f"{AssetTable.__table__.fullname}.uid", ondelete="CASCADE")
Index(None, "account_uid", "time_index", unique=True)
The metadata naming convention, not each model declaration, should assign the physical FK/index names. This keeps models readable while giving Alembic stable constraint identity.
Revision Policy
Generated revision files under:
src/migrations/versions/mainsequence_examples/
must not be accepted when they contain unrelated FK/index churn. This ADR fixes the metadata and Alembic comparison contract so future SDK CLI revision output can be reviewed against a deterministic baseline. It does not require creating migration revisions as part of this implementation.
A follow-up revision for a single model change must not drop and recreate unrelated FKs or indexes.
Generated Revision Review
The first generated file:
src/migrations/versions/mainsequence_examples/0001_migration.py
fixed deterministic constraint and index naming, but it was still generated
from explicit schema="public" metadata. The second no-op autogenerate check
proved that explicit default-schema metadata is still wrong: Alembic generated
FK drop/create churn where every FK name matched but the reflected side was
schema=None and the metadata side was schema="public".
Observed shape from the initial/base revision:
44op.create_table(...)operations and matching44op.drop_table(...)operations;75op.create_index(...)operations and matching75op.drop_index(...)operations;40embeddedsa.ForeignKeyConstraint(...)declarations;- no standalone
op.drop_constraint(...)calls; - no standalone
op.create_foreign_key(None, ...)calls; - no
op.drop_constraint(None, ...)downgrade operations; - PK, FK, CK, and index names are deterministic
op.f(...)names; - generated
op.f(...)names stay within the PostgreSQL identifier length limit.
Observed shape from the no-op revision attempt:
40FKdrop_constraint(...)operations;40FKcreate_foreign_key(...)operations;- no index churn;
- the FK name sets were identical between drop and create;
- the drop side had no schema argument;
- the create side had
source_schema="public"andreferent_schema="public".
That proves the remaining issue was default-schema normalization, not FK naming.
The no-op autogenerate check has been completed after default-schema normalization. Unchanged FKs and indexes no longer produce unrelated churn.
The reviewed revision includes the mainsequence_examples namespace suffix in
physical table names. That is valid only when the intended migration target is
the mainsequence.examples namespace. Future generated revisions must be
reviewed for accidental namespace suffix changes before they are accepted.
Implementation Tasks
- [x] Add regression tests that reproduce the current bad autogenerate inputs:
explicit
publicmetadata, reflected/default-schema comparison, unnamed FKs, and duplicate/overlong index names. - [x] Use SDK naming helpers from
mainsequence.meta_tablesfor table, FK, PK, UQ, CK, and index names instead of carrying a duplicatemsm.schema_namesmodule. - [x] Keep table-name normalization/truncation logic out of
src/msm/base.py;msm.baseimports the SDK helper and only supplies the markets app prefix and namespace suffix. - [x] Implement bounded-name generation that always stays within
POSTGRES_IDENTIFIER_MAX_LENGTHand uses a deterministic digest from the full semantic payload. - [x] Add unit tests for naming normalization, suffix handling, truncation, digest stability, collision resistance for same-first-column indexes, and the PostgreSQL length boundary.
- [x] Configure
MarketsBase.metadatawith a SQLAlchemy naming convention for PKs, FKs, indexes, unique constraints, and check constraints. - [x] Verify loaded provider metadata has no unnamed FK constraints and no unnamed or over-limit indexes.
- [x] Normalize the PostgreSQL default schema so
publicis authored asschema=Nonein SQLAlchemy metadata and the Alembic version table schema. - [x] Override the SDK default table-construction path in the
ms-marketsmixins so default-schema tables are not forced back toschema="public". - [x] Update
src/migrations/env.pysoinclude_schemasis enabled only when provider metadata actually uses non-default named schemas. - [x] Review the generated
src/migrations/versions/mainsequence_examples/0001_migration.pybaseline for deterministic PK/FK/CK/index names and absence of standalone FK drop/create churn. - [x] Verify Alembic autogenerate no longer emits FK drop/create pairs for
unchanged FKs in
public. - [x] Verify Alembic autogenerate no longer emits index churn for unchanged single-column/composite index pairs.
- [x] Add a focused test or scripted check that upgrades to head, runs autogenerate again without model changes, and confirms no schema operations are produced.
- [x] Resolve the git/index state for
src/migrations/versions/mainsequence_examples/0001_migration.pyso the reviewed generated file is tracked normally instead of appearing as a staged delete plus an untracked replacement. - [x] Update MetaTable migration docs to state that default PostgreSQL
publicschema is authored asschema=None, named schemas are opt-in, deterministic naming is required, and generated revisions with unrelated FK/index churn must be rejected.
Consequences
Alembic revision review becomes stricter: generated files with unrelated FK/index churn are considered invalid input, not something to manually clean up and commit.
ms-markets keeps natural SQLAlchemy FK/index declarations in model files, but
the package owns the physical naming contract through one naming module and
MarketsBase.metadata.
The naming module becomes a long-lived compatibility boundary. Once a migration has been released, changing name-generation rules is itself a migration concern because it can cause Alembic to see rename/drop/create operations.
If a future provider uses a real non-default schema, include_schemas must be
enabled and provider filters must be tested so Alembic does not scan unrelated
tables in the same database.