Python class generation from YAML
YAML - I remember using it extensively back in the day. It somehow fell out of favour when a new kid arrived on the block: JSON. This story is not about JSON.
I'm using the PyYAML library in this write-up.
# companies.yaml
- !Company
  company_name: PoundCompany
  active: True
  has_dimensions: True
  dimensions:
    - sillyname1
    - sillyname2
  file_mapping:
    account_number_map: accounts.csv
    account_type_map: accounts.csv
    tax_code_map: taxcodes.csv
  bi_central_uuid: 00000000-0000-0000-0000-000000000001
- !Company
  company_name: EuroCompany
  active: True
  has_dimensions: True
  dimensions:
    - sillyname1
    - sillyname2
  file_mapping:
    account_number_map: accounts.csv
    account_type_map: accounts.csv
    tax_code_map: taxcodes.csv
  bi_central_uuid: 00000000-0000-0000-0000-000000000002
This YAML describes two instances of the class, each introduced by the !Company tag.
# company.py
from dataclasses import dataclass, field


@dataclass
class CompanyModel:
    company_name: str = field(default="")
    active: bool = False
    has_dimensions: bool = False
    dimensions: list = field(default_factory=list)
    file_mapping: dict = field(default_factory=dict)
    bi_central_uuid: str = field(default="")
And this is my class. It might surprise you that I've chosen to name it xxxModel. To me, this is a model. A collection of attributes that I, at some point, am going to serialize to JSON before posting it on the Business Central API. This class is basically data.
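Since the model is plain data, turning it into JSON takes very little. A minimal sketch, assuming the standard library's dataclasses.asdict and json.dumps are good enough for the payload; to_json is a hypothetical helper, not something from the actual project, and the real API may well expect a different shape.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CompanyModel:
    company_name: str = ""
    active: bool = False
    has_dimensions: bool = False
    dimensions: list = field(default_factory=list)
    file_mapping: dict = field(default_factory=dict)
    bi_central_uuid: str = ""


def to_json(company: CompanyModel) -> str:
    """Serialize a model instance to a JSON string (hypothetical helper)."""
    # asdict() recursively converts the dataclass into plain dicts and lists.
    return json.dumps(asdict(company))


company = CompanyModel(
    company_name="PoundCompany",
    active=True,
    dimensions=["sillyname1", "sillyname2"],
)
payload = to_json(company)
```

Fields that were never set fall back to their defaults, so file_mapping ends up as an empty JSON object in the payload.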
The task at hand is to go from YAML to (model)Class!
The PyYAML documentation has a few examples on how to do this, none of which I found particularly interesting.
Onwards! Let's start with reading the YAML.
# company_loader.py
import yaml
from protocols import Settings, Company
from settings import SETTINGS
from company import CompanyModel
from beartype import beartype


@beartype
def get_companies(settings: Settings):
    with open(settings.base_dir / "companies.yaml", "rb") as fp:
        return yaml.load(fp, Loader=get_loader())
The interesting part here is get_loader(). Let's define it.
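# company_loader.py
@beartype
def get_loader() -> type[yaml.SafeLoader]:
    loader = yaml.SafeLoader
    # add_constructor is a classmethod, so this registers the tag
    # on yaml.SafeLoader itself, not on a private copy.
    loader.add_constructor("!Company", company_constructor)
    return loader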
Now we have created the loader and we've added a constructor to it. The constructor does all the heavy lifting and will construct our class. There are a few constructors in PyYAML. To be honest, I gave yaml.SafeLoader.construct_mapping() a go and never figured out why it didn't produce lists and dicts. This is far from rocket science and creating a constructor is relatively easy.
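A likely explanation for the empty lists and dicts: construct_mapping() defaults to deep=False, in which case PyYAML fills nested sequences and mappings lazily, after the custom constructor has already returned. Below is a sketch of that route, assuming deep=True is acceptable. CompanyLoader is a hypothetical subclass, used here so the tag is not registered on yaml.SafeLoader itself, and the YAML is a shortened copy of the file above.

```python
from dataclasses import dataclass, field

import yaml


@dataclass
class CompanyModel:
    company_name: str = ""
    active: bool = False
    has_dimensions: bool = False
    dimensions: list = field(default_factory=list)
    file_mapping: dict = field(default_factory=dict)
    bi_central_uuid: str = ""


class CompanyLoader(yaml.SafeLoader):
    """Hypothetical private loader; keeps yaml.SafeLoader untouched."""


def company_constructor(
    loader: yaml.SafeLoader,
    node: yaml.nodes.MappingNode,
) -> CompanyModel:
    # deep=True builds nested sequences and mappings right away;
    # with deep=False they are still empty while this function runs.
    fields = loader.construct_mapping(node, deep=True)
    return CompanyModel(**fields)


CompanyLoader.add_constructor("!Company", company_constructor)

DOCUMENT = """
- !Company
  company_name: PoundCompany
  active: True
  has_dimensions: True
  dimensions:
    - sillyname1
    - sillyname2
  file_mapping:
    account_number_map: accounts.csv
    tax_code_map: taxcodes.csv
  bi_central_uuid: 00000000-0000-0000-0000-000000000001
"""

companies = yaml.load(DOCUMENT, Loader=CompanyLoader)
```

The trade-off is that every value is converted by PyYAML's own rules, which is less tailored than a hand-written constructor but also less code to maintain.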
This might be a good time to explain why I created this contraption. I needed to move data (appearing in one and only one static format) to the BiCentral API. No room for interpretation.
Anyway. You'll find code in PyYAML that resembles the below function - the main difference being that I understand what my code does and it does exactly what I need.
# company_loader.py
@beartype
def company_constructor(
    loader: yaml.SafeLoader,
    node: yaml.nodes.MappingNode,
) -> Company:
    """Construct a company."""
    _node = {}

    def process_node(node) -> None:
        for _n in node.value:
            attr_name, attr_value = _n
            if isinstance(attr_name, yaml.MappingNode):
                # A mapping used as a key: recurse into it and move on,
                # since its value (a list) cannot be used as a dict key.
                process_node(attr_name)
                continue
            _node[attr_name.value] = ""
            # Tags look like "tag:yaml.org,2002:bool"; keep the last part.
            _cast: str = attr_value.tag.split(":")[-1]
            if _cast in ["bool", "int"]:
                # Works because the values are spelled as Python literals.
                _node[attr_name.value] = eval(attr_value.value)
            elif _cast == "seq":
                _node[attr_name.value] = [
                    item.value for item in attr_value.value
                ]
            elif _cast == "map":
                _node[attr_name.value] = {
                    key.value: val.value
                    for (key, val) in attr_value.value
                }
            else:
                _node[attr_name.value] = attr_value.value

    process_node(node)
    return CompanyModel(**_node)  # type: ignore
As you can see, the process_node() function is recursive and it is tailored to my needs. I don't need to call eval on items in my list or dict instances so naturally, I don't. When it's done I unpack the resulting dict and return an initialized CompanyModel object.
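The eval call works here because the file spells its booleans as True, which happens to be a valid Python literal. YAML also accepts spellings such as true or yes, and eval would choke on those. A small sketch of one alternative, assuming PyYAML's own conversion is wanted for each field: loader.construct_object() applies the tag of the node it is given. FieldLoader, fields_constructor and the !Fields tag are made up for this illustration.

```python
import yaml


class FieldLoader(yaml.SafeLoader):
    """Hypothetical loader used only for this illustration."""


def fields_constructor(
    loader: yaml.SafeLoader,
    node: yaml.nodes.MappingNode,
) -> dict:
    result = {}
    for key_node, value_node in node.value:
        # construct_object converts the node according to its YAML tag
        # (bool, int, str, seq, map), so no eval is needed.
        result[key_node.value] = loader.construct_object(value_node, deep=True)
    return result


FieldLoader.add_constructor("!Fields", fields_constructor)

fields = yaml.load(
    "!Fields {active: true, count: 3, name: PoundCompany}",
    Loader=FieldLoader,
)
```

This keeps the per-field loop, so special cases can still be handled by hand where the defaults do not fit.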
Before wrapping up I'd like to mention two minor details. I really like protocols (Thank you Arjan) and beartype - I also like beartype, a lot.
# protocols.py
from typing import Protocol
from pathlib import Path
from beartype import typing


@typing.runtime_checkable
class Settings(Protocol):
    base_dir: Path


@typing.runtime_checkable
class Company(Protocol):
    company_name: str
    active: bool
    has_dimensions: bool
    dimensions: list
    file_mapping: dict
    bi_central_uuid: str
You'll find the SETTINGS class somewhere else on this blog.
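To show what a runtime-checkable protocol buys you, here is a sketch using only the standard library. LocalSettings is a hypothetical stand-in for the SETTINGS object, which is not shown in this post, and Unrelated exists only to demonstrate a failing check.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class Settings(Protocol):
    base_dir: Path


@dataclass
class LocalSettings:
    """Hypothetical stand-in for the real settings object."""

    base_dir: Path


@dataclass
class Unrelated:
    name: str


# isinstance() against a runtime-checkable protocol only checks that the
# attributes exist; it does not check their types.
settings_ok = isinstance(LocalSettings(base_dir=Path(".")), Settings)
unrelated_ok = isinstance(Unrelated(name="x"), Settings)
```

Neither class inherits from Settings; having a base_dir attribute is what makes LocalSettings pass the check.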
The main takeaways:
YAML is still cool
Things do not need to get too complicated if you scope it well
Protocols are awesome
Arjan is awesome
Beartyping is next-level awesome