sqlmeta.comparison.type_normalizer

Data Type Normalization for Cross-Dialect Comparison.

This module provides data type normalization to handle dialect-specific type equivalences and variations when comparing SQL objects from different sources.

Key Features:

- Normalize type names (INT → INTEGER, VARCHAR2 → VARCHAR)
- Handle precision and scale variations
- Support cross-dialect equivalences (TEXT vs CLOB)
- Dialect-specific transformations

Supported Dialects:

- PostgreSQL
- Oracle
- MySQL
- SQL Server
- DB2

Classes

DataTypeNormalizer()

Normalizes data types across SQL dialects for comparison.

class sqlmeta.comparison.type_normalizer.DataTypeNormalizer

Normalizes data types across SQL dialects for comparison.

This class handles dialect-specific type equivalences, precision/scale normalization, and cross-dialect type mapping to enable accurate comparison of SQL Model objects from different sources.

Example

>>> normalizer = DataTypeNormalizer()
>>> normalizer.normalize("INT", "postgresql")
'INTEGER'
>>> normalizer.normalize("VARCHAR2(100)", "oracle")
'VARCHAR(100)'
>>> normalizer.normalize("TINYINT(1)", "mysql")
'BOOLEAN'

__init__()[source]: Initialize the data type normalizer.

normalize(data_type: str, dialect: str, precision: int | None = None, scale: int | None = None) → str

Normalize a data type for the given dialect.

Parameters:

data_type – The data type to normalize (e.g., “INT”, “VARCHAR2”)
dialect – The SQL dialect (postgresql, oracle, mysql, sqlserver, db2)
precision – Optional precision value
scale – Optional scale value

Returns:

Normalized data type string

Example

>>> normalizer.normalize("INT", "postgresql")
'INTEGER'
>>> normalizer.normalize("NUMBER", "oracle", 10, 2)
'NUMBER(10,2)'
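
To make the idea behind name normalization concrete, here is a minimal, self-contained sketch of one way such a lookup could work. It is not sqlmeta's implementation: the function name sketch_normalize, the _ALIASES table, and its entries are hypothetical and chosen only to mirror the examples above. Whole-string special cases such as TINYINT(1) are deliberately left out.

```python
import re

# Hypothetical alias table for illustration only; the mappings that
# sqlmeta actually uses are not documented on this page.
_ALIASES = {
    "postgresql": {"INT": "INTEGER", "INT4": "INTEGER"},
    "oracle": {"VARCHAR2": "VARCHAR"},
    "mysql": {"INT": "INTEGER"},
}

# Split "NAME(args)" into the base name and an optional parenthesised suffix.
_TYPE_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_ ]*?)\s*(\(.*\))?\s*$")


def sketch_normalize(data_type, dialect, precision=None, scale=None):
    """Map a type name to a canonical spelling (illustrative sketch)."""
    match = _TYPE_RE.match(data_type)
    if match is None:
        # Unparseable input: fall back to a trimmed, upper-cased copy.
        return data_type.strip().upper()

    base = match.group(1).upper()
    suffix = match.group(2) or ""

    # Explicit precision/scale arguments are used only when the
    # type string itself carries no "(...)" suffix.
    if not suffix and precision is not None:
        if scale is not None:
            suffix = f"({precision},{scale})"
        else:
            suffix = f"({precision})"

    # Replace the base name with its canonical alias, if one is known.
    base = _ALIASES.get(dialect.lower(), {}).get(base, base)
    return base + suffix


print(sketch_normalize("INT", "postgresql"))
print(sketch_normalize("VARCHAR2(100)", "oracle"))
print(sketch_normalize("NUMBER", "oracle", 10, 2))
```

Keeping the alias table keyed by dialect means the same spelling can normalize differently depending on where it came from, which is the property a cross-dialect comparison needs.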

are_equivalent(type1: str, type2: str, dialect1: str, dialect2: str) → bool

Check if two data types are equivalent across dialects.

Parameters:

type1 – First data type
type2 – Second data type
dialect1 – Dialect of first type
dialect2 – Dialect of second type

Returns:

True if types are equivalent, False otherwise

Example

>>> normalizer.are_equivalent("TEXT", "CLOB", "postgresql", "oracle")
True
>>> normalizer.are_equivalent("INT", "VARCHAR", "mysql", "mysql")
False
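
One plausible way to decide cross-dialect equivalence is to assign each (dialect, type) pair to a shared family and compare families. The sketch below assumes that approach; the family names, their members, and the function sketch_are_equivalent are hypothetical and do not describe how sqlmeta decides equivalence internally.

```python
# Hypothetical equivalence families; membership is an assumption made
# for illustration, not a statement about sqlmeta's internal tables.
_FAMILIES = {
    "large_text": {
        ("postgresql", "TEXT"),
        ("oracle", "CLOB"),
        ("mysql", "LONGTEXT"),
    },
    "integer": {
        ("postgresql", "INT"),
        ("postgresql", "INTEGER"),
        ("mysql", "INT"),
        ("oracle", "INTEGER"),
    },
}


def _family(data_type, dialect):
    """Return the family name for a (dialect, type) pair, or None."""
    key = (dialect.lower(), data_type.strip().upper())
    for name, members in _FAMILIES.items():
        if key in members:
            return name
    return None


def sketch_are_equivalent(type1, type2, dialect1, dialect2):
    """Treat two types as equivalent when they share a family."""
    fam1 = _family(type1, dialect1)
    fam2 = _family(type2, dialect2)
    if fam1 is not None and fam1 == fam2:
        return True
    # Fallback: a plain case-insensitive comparison of the type names.
    return type1.strip().upper() == type2.strip().upper()


print(sketch_are_equivalent("TEXT", "CLOB", "postgresql", "oracle"))
print(sketch_are_equivalent("INT", "VARCHAR", "mysql", "mysql"))
```

This sketch ignores precision and scale entirely; a fuller comparison would likely normalize both inputs first and then compare the results.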

extract_precision_scale(data_type: str) → Tuple[int | None, int | None]

Extract precision and scale from a data type string.

Parameters:

data_type – Data type with optional precision/scale (e.g., “NUMBER(10,2)”)

Returns:

Tuple of (precision, scale), both may be None

Example

>>> normalizer.extract_precision_scale("VARCHAR(100)")
(100, None)
>>> normalizer.extract_precision_scale("NUMBER(10,2)")
(10, 2)
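
The extraction shown above can be approximated with a single regular expression. The following is a sketch under that assumption, using only the standard library; the name sketch_extract_precision_scale and the pattern are hypothetical and may differ from what sqlmeta does (for example, in how non-numeric arguments such as MAX are treated).

```python
import re
from typing import Optional, Tuple

# Match "(precision)" or "(precision, scale)" with optional whitespace.
# A leading minus sign is allowed on the scale.
_PAREN_RE = re.compile(r"\(\s*(\d+)\s*(?:,\s*(-?\d+)\s*)?\)")


def sketch_extract_precision_scale(
    data_type: str,
) -> Tuple[Optional[int], Optional[int]]:
    """Return (precision, scale) parsed from a type string (sketch)."""
    match = _PAREN_RE.search(data_type)
    if match is None:
        # No numeric "(...)" suffix, e.g. "TEXT" or "VARCHAR(MAX)".
        return (None, None)
    precision = int(match.group(1))
    scale = int(match.group(2)) if match.group(2) is not None else None
    return (precision, scale)


print(sketch_extract_precision_scale("VARCHAR(100)"))
print(sketch_extract_precision_scale("NUMBER(10,2)"))
print(sketch_extract_precision_scale("TEXT"))
```

Returning None for a missing component, rather than 0, keeps "no scale given" distinguishable from an explicit scale of zero.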