# From Pydantic to msgspec: A Migration Story About Validation and Control

## The Migration That Made Me Question Everything

I thought I had Python validation figured out. Pydantic was my trusted companion: elegant, powerful, and seemingly complete. Why would I need anything else?

Then I decided to migrate a project from FastAPI to Litestar. The reasons were good: better performance, cleaner architecture, advanced SQLAlchemy integration. But there was a catch: Litestar favored msgspec over Pydantic. And msgspec? I'd barely heard of it.

"How hard could it be?" I thought. "Validation is validation, right?"

Wrong. So wrong.

## The Pydantic Comfort Zone

Let me show you what I was used to. Here's how I'd been writing validation for years:

```python
from fastapi import FastAPI
from pydantic import BaseModel, EmailStr, field_validator

app = FastAPI()

class UserRegistration(BaseModel):
    email: EmailStr  # Automatic email validation
    password: str
    age: int

    @field_validator('password')
    @classmethod
    def validate_password(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not any(c.isupper() for c in v):
            raise ValueError('Password must contain uppercase letter')
        return v

    @field_validator('age')
    @classmethod
    def validate_age(cls, v: int) -> int:
        if v < 18:
            raise ValueError('Must be 18 or older')
        return v

# Usage in FastAPI (create_user is the app's own helper, defined elsewhere)
@app.post("/register")
async def register(data: UserRegistration):
    # If we get here, data is GUARANTEED valid
    # Everything is checked: types, email format, password rules, age
    user = await create_user(data.model_dump())
    return {"message": "User created"}
```

Beautiful, right? One model definition, all validation in one place. The moment you create a `UserRegistration` instance, Pydantic validates *everything*: types, formats, business rules. If the data reaches your endpoint handler, you know it's valid.

This felt complete. Elegant. *Right*.

## First Contact: Where Are My Validators?

Fast forward to my Litestar migration.
I started converting my Pydantic models to msgspec:

```python
import msgspec
from litestar import post

class UserRegistration(msgspec.Struct):
    email: str
    password: str
    age: int

# Usage in Litestar (create_user is the app's own helper, defined elsewhere)
@post("/register")
async def register(data: UserRegistration) -> dict:
    # Wait... where do I put the email validation?
    # Where's my password strength check?
    # How do I validate age >= 18?
    user = await create_user(msgspec.to_builtins(data))
    return {"message": "User created"}
```

I stared at the screen, confused. Where were the decorators? Where was `EmailStr`? How do I add my custom validators?

I spent an embarrassing amount of time searching for "msgspec field_validator equivalent" and finding... nothing helpful.

My first thought: "This library is incomplete. How can people use this?"

My second thought: "Maybe I'm missing something."

Spoiler: I was missing something big.

## The Uncomfortable Truth: Validation Comes in Two Flavors

After hours of frustration (and reading way too much documentation), I had a realization: **msgspec and Pydantic have fundamentally different philosophies about what validation means**.

Here's what finally clicked for me:

### Pydantic's Philosophy: "Everything, Everywhere, All at Once"

Pydantic believes validation should happen *at the model boundary*. When you create a model instance, **everything** gets validated: types, formats, business rules, constraints. The model definition is the single source of truth.

```python
# With Pydantic: ONE validation point
user = UserRegistration(**incoming_data)
# ↑ If this succeeds, ALL validation passed
# Types? ✓ Email format? ✓ Password rules? ✓ Age check? ✓
```

### msgspec's Philosophy: "Fast Types, You Handle the Rest"

msgspec believes in separating concerns:

- **Type validation**: lightning-fast, automatic, guaranteed, just like Pydantic's type checks
- **Business validation**: your application, your rules, your code

```python
# With msgspec: TWO validation layers
user = msgspec.convert(incoming_data, type=UserRegistration)
# ↑ Types validated (str, int, etc.) - FAST
# ↓ Business rules - YOUR CODE
if len(user.password) < 8:
    raise ValueError("Password too short")
```

One detail matters here: msgspec checks types when it decodes or converts data (which is what Litestar does with the request body), not when you call the struct's constructor directly. `UserRegistration(**incoming_data)` would build the object without checking anything.

This felt wrong at first. Why split validation? Isn't Pydantic's approach more... complete?

But then I started to understand the advantages.

## The Lightbulb Moment: Explicit Control

Here's what changed my mind: **msgspec gives you explicit control over when and how business validation happens**.

Let me show you what I mean with a real example from my migration.

### The Password Reset Problem

In my app, users can register with a password OR reset their password. Different endpoints, different validation needs:

**With Pydantic**, I ended up with this:

```python
from pydantic import BaseModel, EmailStr, field_validator

class UserRegistration(BaseModel):
    email: EmailStr
    password: str
    full_name: str

    @field_validator('password')
    @classmethod
    def validate_password(cls, v: str) -> str:
        # Always enforces strong password
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        return v

class PasswordReset(BaseModel):
    email: EmailStr
    password: str
    reset_token: str

    @field_validator('password')
    @classmethod
    def validate_password(cls, v: str) -> str:
        # Duplicate validation code!
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        return v
```

Notice the duplication? I'm defining the same password validation in two places. If I change the rules (say, require special characters), I need to update both validators. Pydantic does offer ways to share a validator between models, but with the decorator style shown here each model carries its own copy.

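For reference, here is a minimal sketch of how a shared rule could look on the Pydantic side, assuming Pydantic v2 and its `AfterValidator` helper. The `check_password` function and the `StrongPassword` alias are hypothetical names made up for this illustration, and `email` is a plain `str` to keep the sketch free of the optional email dependency.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


def check_password(value: str) -> str:
    """Shared rule: reject passwords shorter than 8 characters."""
    if len(value) < 8:
        raise ValueError("Password must be at least 8 characters")
    return value


# Hypothetical alias: any model can use it as a field type.
StrongPassword = Annotated[str, AfterValidator(check_password)]


class UserRegistration(BaseModel):
    email: str
    password: StrongPassword
    full_name: str


class PasswordReset(BaseModel):
    email: str
    password: StrongPassword
    reset_token: str
```

This removes the copy-paste, but the rule still fires automatically on every instantiation, so the question of *when* validation runs stays with the model rather than with the calling code.
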
**With msgspec**, I could do this:

```python
import msgspec

class UserRegistration(msgspec.Struct):
    email: str
    password: str
    full_name: str

class PasswordReset(msgspec.Struct):
    email: str
    password: str
    reset_token: str

# Validation logic lives in ONE place - the service
class UserService:
    @staticmethod
    def validate_password(password: str) -> None:
        """Centralized password validation"""
        if len(password) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in password):
            raise ValueError("Password must contain uppercase")
        if not any(c.isdigit() for c in password):
            raise ValueError("Password must contain number")

    async def register(self, data: UserRegistration):
        self.validate_password(data.password)  # Reusable!
        ...  # create user

    async def reset_password(self, data: PasswordReset):
        self.validate_password(data.password)  # Same validation!
        ...  # reset password
```

Suddenly it clicked: **by separating type validation from business validation, msgspec lets me reuse validation logic across different contexts**.

## Wait, But msgspec Structs CAN Have Methods

Here's where I had another revelation. I assumed msgspec Structs were just dumb data containers. They're not!

```python
import msgspec
from litestar import post

class UserRegistration(msgspec.Struct):
    email: str
    password: str
    age: int

    def validate(self) -> None:
        """Custom validation - attached to the model!"""
        # Email validation
        if "@" not in self.email or "." not in self.email.split("@")[-1]:
            raise ValueError("Invalid email format")

        # Password validation
        if len(self.password) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in self.password):
            raise ValueError("Password needs uppercase letter")

        # Age validation
        if self.age < 18:
            raise ValueError("Must be 18 or older")

# Usage in controller
@post("/register")
async def register(data: UserRegistration) -> dict:
    # Types already validated by msgspec
    # Business validation is explicit
    data.validate()
    user = await create_user(msgspec.to_builtins(data))
    return {"message": "User created"}
```

So I *could* attach validation to the model, just like Pydantic. The difference? **It's explicit**. I call `.validate()` when *I* decide it's time.

This gives me control that Pydantic's automatic validation doesn't offer by default:

- Validate at different points in the request lifecycle
- Skip validation for trusted internal calls
- Customize validation based on context
- Separate validation errors from deserialization errors

## The Performance Surprise

I hadn't even been thinking about performance. I just wanted my validation to work. But then I ran some benchmarks.
```python
import time

import msgspec
import pydantic

class PydanticUser(pydantic.BaseModel):
    id: int
    email: str
    full_name: str
    is_active: bool

class MsgspecUser(msgspec.Struct):
    id: int
    email: str
    full_name: str
    is_active: bool

data = {
    "id": 12345,
    "email": "user@example.com",
    "full_name": "John Doe",
    "is_active": True
}

# Benchmark Pydantic: build and type-check a model from a dict
start = time.perf_counter()
for _ in range(100_000):
    user = PydanticUser(**data)
pydantic_time = time.perf_counter() - start

# Benchmark msgspec: convert() is the call that type-checks a dict
start = time.perf_counter()
for _ in range(100_000):
    user = msgspec.convert(data, type=MsgspecUser)
msgspec_time = time.perf_counter() - start

print(f"Pydantic: {pydantic_time:.3f}s")
print(f"msgspec: {msgspec_time:.3f}s")
print(f"msgspec is {pydantic_time / msgspec_time:.1f}x faster")

# Output (Using: Ubuntu 24.04.3 LTS, Python 3.14.0b2):
# Pydantic: 0.053s
# msgspec: 0.017s
# msgspec is 3.2x faster
```

**3.2x faster** on my machine. For type validation alone.

I wasn't even optimizing for performance, but suddenly my API endpoints felt snappier. Response times dropped. This is a single-run micro-benchmark on a tiny model, so treat the exact ratio as indicative, but under load I'd expect the difference to matter even more.

## Side by Side: The Complete Picture

Let me show you both approaches for the same real-world scenario: user registration with email, password, and age validation.
### The Pydantic Way

```python
from fastapi import FastAPI
from pydantic import BaseModel, EmailStr, field_validator

app = FastAPI()

class UserRegistration(BaseModel):
    email: EmailStr
    password: str
    age: int
    full_name: str | None = None

    @field_validator('password')
    @classmethod
    def validate_password(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not any(c.isupper() for c in v):
            raise ValueError('Password must contain uppercase')
        if not any(c.isdigit() for c in v):
            raise ValueError('Password must contain number')
        return v

    @field_validator('age')
    @classmethod
    def validate_age(cls, v: int) -> int:
        if v < 18:
            raise ValueError('Must be 18 or older')
        if v > 120:
            raise ValueError('Invalid age')
        return v

# FastAPI endpoint (user_service is the app's own service, defined elsewhere)
@app.post("/register")
async def register(data: UserRegistration):
    # Everything validated automatically
    user = await user_service.create(data.model_dump())
    return {"id": user.id}
```

**Pros:**

- All validation in one place
- Automatic on instantiation
- `EmailStr` handles format validation
- Clean endpoint code

**Cons:**

- Validation logic tied to request model
- Reusing validators across models takes extra work (shared functions or `Annotated` types)
- Skipping validation for internal use means reaching for `model_construct()`
- Slower validation than msgspec in my benchmark

### The msgspec Way

```python
import msgspec
from litestar import post

class UserRegistration(msgspec.Struct):
    email: str
    password: str
    age: int
    full_name: str | None = None

    def validate(self) -> None:
        """Business validation - called when YOU want"""
        # Email validation
        if "@" not in self.email or "." not in self.email.split("@")[-1]:
            raise ValueError("Invalid email format")

        # Password validation
        if len(self.password) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in self.password):
            raise ValueError("Password must contain uppercase")
        if not any(c.isdigit() for c in self.password):
            raise ValueError("Password must contain number")

        # Age validation
        if self.age < 18:
            raise ValueError("Must be 18 or older")
        if self.age > 120:
            raise ValueError("Invalid age")

# Litestar endpoint (user_service is the app's own service, defined elsewhere)
@post("/register")
async def register(data: UserRegistration) -> dict:
    # Types validated automatically by msgspec
    # Business validation is explicit
    data.validate()
    user = await user_service.create(msgspec.to_builtins(data))
    return {"id": user.id}
```

**Pros:**

- Blazing fast type validation
- Explicit validation control
- Can reuse validation logic
- Can skip validation when appropriate
- Clear separation of concerns

**Cons:**

- Must remember to call `.validate()`
- No built-in `EmailStr` equivalent
- More manual validation code
- Endpoint code has one extra line

### Alternative: Validation in Service Layer

With msgspec, you can also move validation to the service:

```python
import msgspec
from litestar import post

class UserRegistration(msgspec.Struct):
    email: str
    password: str
    age: int
    full_name: str | None = None

class UserService:
    # "User" is the app's own model, defined elsewhere
    async def create(self, data: dict) -> "User":
        """Validate during creation"""
        # Email validation + uniqueness check
        if "@" not in data["email"]:
            raise ValueError("Invalid email")
        if await self.repository.email_exists(data["email"]):
            raise ValueError("Email already registered")

        # Password validation (reusable across endpoints!)
        self._validate_password(data["password"])

        # Age validation
        if data["age"] < 18:
            raise ValueError("Must be 18 or older")

        return await self.repository.create(data)

    @staticmethod
    def _validate_password(password: str) -> None:
        """Centralized password validation"""
        if len(password) < 8:
            raise ValueError("Password too short")
        # ... more checks

# Controller stays clean
@post("/register")
async def register(data: UserRegistration) -> dict:
    # Types validated by msgspec
    # Business rules validated in service
    user = await user_service.create(msgspec.to_builtins(data))
    return {"id": user.id}
```

This pattern centralizes business logic where it belongs: in the service layer, not the request model.

## When to Use Which?

After living with both libraries, here's my decision matrix:

### Choose Pydantic When

- **You want automatic validation everywhere** - Set it and forget it
- **You're building a standard CRUD API** - Pydantic's patterns are well-established
- **Your team prefers "magic"** - Less boilerplate, more automatic behavior
- **You need the ecosystem** - Pydantic has integrations everywhere
- **Performance isn't critical** - Still fast enough for most use cases

### Choose msgspec When

- **Performance matters** - High-throughput APIs, real-time systems
- **You want explicit control** - You decide when validation happens
- **You value separation of concerns** - Business logic in services, not models
- **You're using Litestar** - Native integration, optimal performance
- **You need validation reuse** - Share logic across different contexts

### You Can Use Both

Here's a secret: they can coexist. Use Pydantic for complex validation where you want the ecosystem, msgspec for high-performance paths.

```python
# Pydantic for complex admin operations
class ComplexUserUpdate(BaseModel):
    # Use all of Pydantic's validation features
    ...

# msgspec for high-frequency endpoints
class UserListResponse(msgspec.Struct):
    # Fast serialization for list endpoints
    ...
```

## What I Learned

This migration taught me that **there's no one "right" way to do validation**. Pydantic and msgspec represent different trade-offs:

- **Pydantic** says: "Validation should be automatic and comprehensive"
- **msgspec** says: "Validation should be explicit and fast"

Both philosophies are valid. Both have their place.

I started this migration frustrated, thinking msgspec was incomplete. I ended it appreciating how **explicit control over validation** can actually make code clearer and more maintainable.

The `.validate()` call that felt like extra boilerplate? Now I see it as documentation: a clear signal that "validation happens here."

The manual validation methods? They're reusable functions I can test independently.

And the performance? That was just a nice bonus I wasn't even looking for.

## The Migration Continues

My Litestar migration is still ongoing, but I'm no longer fighting msgspec. I'm embracing it. The initial confusion has given way to appreciation.

Would I go back to Pydantic for everything? No. Would I use msgspec for everything? Also no.

The real lesson: **understand your tools' philosophies**. Pydantic isn't "complete" and msgspec isn't "incomplete"; they're just optimized for different values. Knowing when to use which makes you a better developer.

And sometimes, the libraries you think are missing features are actually giving you something more valuable: control.

---

*(Written by Human, improved using AI where applicable.)*