What are the main differences between NamedTuple and TypedDict in Python / mypy
It seems to me that NamedTuple and TypedDict are fairly similar and the Python developers themselves recognized that.
Concerning the PEP, I would rather add a common section about NamedTuple and TypedDict, they are quite similar and the latter already behaves structurally. What do you think?
But then Guido seems not so sure about that.
I'm not so sure that NamedTuple and TypedDict are really all that similar (except they are both attempts to handle outdated patterns in a statically-typed world).
So, this is my lazy attempt to get someone else to come up with a crisp comparison where the official documentation seems lacking.
Solution 1:[1]
Python and its community are wrestling with the "struct" problem: how to best group related values into composite data objects that allow logical/easy accessing of components (typically by name). There are many competing approaches:
- collections.namedtuple instances
- dictionaries (with a fixed/known set of keys)
- attribute-accessible dictionaries (like stuf)
- the attrs library
- PEP 557 dataclasses
- plain old bespoke objects hand-crafted for every struct type
- sequences like tuple and list with implied meanings for each position/slot (archaic but extremely common)
- etc.
So much for "There should be one—and preferably only one—obvious way to do it."
Both the typing library and Mypy, like the Python community at large, are simultaneously struggling with how to more effectively define types/schema, including for composite objects. The discussion you linked to is part of that wrestling and trying to find a way forward.
NamedTuple is a typing superclass for structured objects resulting from the collections.namedtuple factory; TypedDict is a Mypy attempt to define the keys and corresponding value types that occur when using fixed-schema dictionaries. They are similar if you're just thinking "I have a fixed set of keys that should map to a fixed set of typed values." But the resulting implementations and constraints are very different. Are a bag and a box similar? Maybe. Maybe not. It depends on your perspective and how you want to use them. Pour wine and let the discussion begin!
NamedTuple, by the way, is now a formal part of Python.
from typing import NamedTuple
class Employee(NamedTuple):
    name: str
    id: int
TypedDict started life as an experimental Mypy feature to wrangle typing onto the heterogeneous, structure-oriented use of dictionaries. As of Python 3.8, however, it was adopted into the standard library.
try:
    from typing import TypedDict  # >=3.8
except ImportError:
    from mypy_extensions import TypedDict  # <=3.7
Movie = TypedDict('Movie', {'name': str, 'year': int})
A class-based type constructor is also available:
class Movie(TypedDict):
    name: str
    year: int
Despite their differences, both NamedTuple and TypedDict lock down the specific keys to be used, and the types of values corresponding to each key. Therefore they are aiming at basically the same goal: Being useful typing mechanisms for composite/struct types.
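To make the contrast concrete, here is a minimal sketch that declares the same two-field schema both ways. The names `MovieTuple` and `MovieDict` are made up for illustration and do not come from the discussion above.

```python
from typing import NamedTuple, TypedDict


class MovieTuple(NamedTuple):
    name: str
    year: int


class MovieDict(TypedDict):
    name: str
    year: int


movie_t = MovieTuple(name="Blade Runner", year=1982)
movie_d = MovieDict(name="Blade Runner", year=1982)

# Field access: attribute (or position) for the tuple, key lookup for the dict.
title_from_tuple = movie_t.name
title_from_dict = movie_d["name"]

# The tuple is ordered and can be unpacked; the dict is a plain mapping.
name, year = movie_t
keys = list(movie_d)

# At runtime the two are different kinds of object.
is_tuple = isinstance(movie_t, tuple)
is_dict = type(movie_d) is dict
```

The schema is the same in both cases; what differs is the container you get back and therefore how the rest of your code has to touch it.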
Python's standard typing.Dict focuses on much more homogeneous, parallel mappings, defining key/value types, not keys per se. Therefore it is not very useful in defining composite objects that happen to be stored in dictionaries.
from typing import Dict
ConnectionOptions = Dict[str, str]
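A small sketch of that difference, assuming a hypothetical `ConnectionConfig` schema invented for this example: `Dict[str, str]` constrains the types of keys and values but leaves the key set open, whereas a TypedDict names the allowed keys.

```python
from typing import Dict, TypedDict

# Homogeneous mapping: any string key, every value is a string.
ConnectionOptions = Dict[str, str]


# Fixed schema: exactly these keys, each with its own value type.
class ConnectionConfig(TypedDict):
    host: str
    port: int


options: ConnectionOptions = {"timeout": "30", "user": "anna"}
options["anything"] = "goes"  # fine: the key set is open-ended

config: ConnectionConfig = {"host": "localhost", "port": 5432}
# A type checker such as mypy is expected to reject the next line because
# "prot" is not a declared key. Python itself would not complain.
# config["prot"] = 5433
```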
Solution 2:[2]
There are a couple of minor differences. Note that those containers haven't been there forever:
- PEP 557 -- Data Classes: Python 3.7
- collections.namedtuple: Python 2.6
- typing.NamedTuple: Python 3.5 (class syntax with annotations since Python 3.6)
- PEP 589 -- TypedDict: Python 3.8
I would go for NamedTuple if possible and if I want the values to be frozen. Otherwise I would use a dataclass.
from dataclasses import dataclass
from typing import NamedTuple, TypedDict
from enum import Enum
class Gender(Enum):
    MALE = "male"
    FEMALE = "female"
## Class definition: Almost the same
@dataclass
class UserDataC:
    name: str
    gender: Gender
class UserTuple(NamedTuple):
    name: str
    gender: Gender
class UserNDict(TypedDict):
    name: str
    gender: Gender
## Object Creation: Looks the same
anna_datac = UserDataC(name="Anna", gender=Gender.FEMALE)
anna_tuple = UserTuple(name="Anna", gender=Gender.FEMALE)
anna_ndict = UserNDict(name="Anna", gender=Gender.FEMALE)
## Mutable values vs frozen values
anna_datac.gender = Gender.MALE
# anna_tuple.gender = Gender.MALE # AttributeError: can't set attribute
anna_ndict["gender"] = Gender.MALE
# AttributeError: 'dict' object has no attribute 'gender'
# anna_ndict.gender = Gender.MALE
## New attribute
# Note that you can add new attributes like this.
# Python will not complain. But mypy will.
anna_datac.password = "secret" # Dataclasses are extensible
# anna_tuple.password = "secret" # AttributeError - named tuples not
# anna_ndict.password = "secret" # AttributeError - TypedDict not
anna_ndict["password"] = "secret"
## isinstance
assert isinstance(anna_tuple, tuple)
assert isinstance(anna_ndict, dict)
Why I prefer NamedTuple over namedtuple
I think it's more intuitive to write and read. Plus you give mypy more possibilities to check:
from collections import namedtuple
class UserTuple(NamedTuple):
    name: str
    gender: Gender
# vs
UserTuple = namedtuple("UserTuple", ["name", "gender"])
Why I prefer tuples over dictionaries
If I don't need things to be mutable, I prefer that they aren't. This way I prevent unexpected side effects.
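As an illustration of the kind of side effect meant here, consider this sketch. The `Point` type and the two `shift_right_*` helpers are hypothetical and exist only for this example.

```python
from typing import Dict, NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def shift_right_dict(point: Dict[str, int]) -> Dict[str, int]:
    # Mutates its argument in place: the caller's dict changes too.
    point["x"] += 1
    return point


def shift_right_tuple(point: Point) -> Point:
    # A named tuple cannot be changed in place; _replace returns a new one.
    return point._replace(x=point.x + 1)


origin_dict = {"x": 0, "y": 0}
moved_dict = shift_right_dict(origin_dict)

origin_tuple = Point(x=0, y=0)
moved_tuple = shift_right_tuple(origin_tuple)
```

After these calls `origin_dict` has silently changed, while `origin_tuple` is still the original point.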
Solution 3:[3]
A TypedDict (in 3.8+) is
A simple typed namespace. At runtime it is equivalent to a plain dict.
whereas a NamedTuple is a "tuple subclass." Note that
Named tuple instances do not have per-instance dictionaries, so they are lightweight and require no more memory than regular tuples.
and (from here)
NamedTuple subclasses can also have docstrings and methods
To put that in my own words, a NamedTuple is more like a custom object, and a TypedDict is more like, well, a typed dictionary.
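A short sketch of what "more like a custom object" can look like in practice. The `badge` method and the default value are illustrative additions, not something from the quoted documentation.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    """An employee record with a name and a numeric id."""

    name: str
    id: int = 0  # fields may have default values

    def badge(self) -> str:
        # Ordinary methods work because this is a real class.
        return f"{self.name} (#{self.id})"


guido = Employee("Guido", 1)
label = guido.badge()
doc = Employee.__doc__
```

A TypedDict class, by contrast, only declares keys and value types; it is not meant to carry methods of its own.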
I haven't checked, but from these descriptions, I would expect NamedTuples to have some (small) runtime and memory advantages over TypedDicts.
However, if you are using an API, for example, that expects a dict, a TypedDict may be preferable since it is a dict (though you can also create a dict from a NamedTuple via its _asdict() method).
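Both points can be checked with a quick sketch like the one below. It assumes CPython (where `sys.getsizeof` is available) and uses made-up `PointTuple` and `PointDict` types; the sizes are shallow and will vary between versions, so treat them as a rough indication, not a benchmark.

```python
import json
import sys
from typing import NamedTuple, TypedDict


class PointTuple(NamedTuple):
    x: int
    y: int


class PointDict(TypedDict):
    x: int
    y: int


pt = PointTuple(x=1, y=2)
pd = PointDict(x=1, y=2)

# Named tuple instances carry no per-instance __dict__.
has_instance_dict = hasattr(pt, "__dict__")

# Shallow sizes in bytes; exact numbers depend on interpreter and version.
tuple_size = sys.getsizeof(pt)
dict_size = sys.getsizeof(pd)

# For an API that wants a dict, convert the tuple explicitly.
as_json_from_tuple = json.dumps(pt._asdict())
as_json_from_dict = json.dumps(pd)
```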
Solution 4:[4]
From an excellent book "Python Object-Oriented Programming" by Steven F. Lott and Dusty Phillips:
- For a lot of cases, dataclasses offer a number of helpful features with less code writing. They can be immutable, or mutable, giving us a wide range of options.
- For cases where the data is immutable, a NamedTuple can be slightly more efficient than a frozen dataclass by about 5% – not much. What tips the balance here is an expensive attribute computation. While a NamedTuple can have properties, if the computation is very costly and the results are used frequently, it can help to compute it in advance, something a NamedTuple isn't good at. Check out the documentation for dataclasses and their __post_init__() method as a better choice in the rare case where it's helpful to compute an attribute value in advance.
- Dictionaries are ideal when the complete set of keys isn't known in advance. When we're starting a design, we may have throwaway prototypes or proofs of concept using dictionaries. When we try to write unit tests and type hints, we may need to ramp up the formality. In some cases, the domain of possible keys is known, and a TypedDict type hint makes sense as a way to characterize the valid keys and value types.
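The point about computing an attribute in advance might look roughly like this. The `CircleTuple` and `CircleData` classes are hypothetical, and the area calculation merely stands in for something expensive; this is a sketch, not the book's own example.

```python
import math
from dataclasses import dataclass, field
from typing import NamedTuple


class CircleTuple(NamedTuple):
    radius: float

    @property
    def area(self) -> float:
        # Recomputed on every access.
        return math.pi * self.radius ** 2


@dataclass(frozen=True)
class CircleData:
    radius: float
    area: float = field(init=False)

    def __post_init__(self) -> None:
        # Computed once at construction. A frozen dataclass blocks normal
        # assignment, so object.__setattr__ is the usual workaround.
        object.__setattr__(self, "area", math.pi * self.radius ** 2)


circle_t = CircleTuple(2.0)
circle_d = CircleData(2.0)
```

Both objects expose `.area`, but only the dataclass stores the result, so repeated reads do not repeat the computation.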
Solution 5:[5]
The NamedTuple is a specific type. As the name suggests, it is a tuple that is extended to have named entries.
TypedDict is not a real runtime type: its instances are plain dicts, and you can't (or at least shouldn't) use it as a class of its own. Instead it adds type information (for type checkers such as mypy) in scenarios where a dictionary has various keys with different value types, i.e. essentially all the places where one would otherwise use a NamedTuple. It's very helpful for annotating existing code that you don't want to refactor.
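A minimal sketch of annotating existing dict-based code this way. The `describe` function and the record are invented for the example, and the annotations assume Python 3.8+.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


def describe(movie: Movie) -> str:
    # Existing dict-style code keeps working; only the annotation is new.
    return f"{movie['name']} ({movie['year']})"


legacy_record: Movie = {"name": "Alien", "year": 1979}  # still a dict literal
text = describe(legacy_record)

# Calling the TypedDict class just builds an ordinary dict.
built = Movie(name="Alien", year=1979)

# Runtime instance checks against a TypedDict are not supported.
try:
    isinstance(built, Movie)
    instance_check_error = None
except TypeError as exc:
    instance_check_error = exc
```

The function body never changes; the type checker gains knowledge of the keys, while the runtime still sees nothing but a dict.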
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | Mark |
| Solution 4 | Tony N |
| Solution 5 | Derek |
