Skip to content

2023#

ChatGPT vs Me: As a Children's Authors

For all the right reasons, large language models have taken the world by storm! It’s pretty impressive what ChatGPT can do - unprecedented coherence, relevance and tone of delivery for a synthetic text generation. Well done OpenAI!

I have to admit I have strongly resisted the urge to use it for purposes beyond joy, and exploration. It's been very hard to abstain, especially because I am extremely passionate about AI and also I work in AI. I truly, madly, deeply, utterly fear that we, as a community of carers of young people, are simply NOT ready for it! It’s the infancy state of due diligence to understand the societal implications and governance around it that scares me the most. I fear that while we, the grown-ups, are probably ready and raring to go at it, it’s our younger generations who are not at all ready for it. Once this genie is out of the bottle, out is just the way! And it’s the repeat of our social media fiasco all over again. Perhaps I am coming from a dark place of seeing how exponentially easily our kids are getting social media bullied and exploited with authorities still scrambling even after the fact on how to contain it!

On a lighter note, selfishly speaking, I also fear depleting my creative and critical thinking abilities in favour of convenience and productivity. If Darwin’s theory is anything to go by, that’s one human talent we should preserve and reinforce, I think. Did I hear you say “But you can still be critical of the presented text?”? Sure, but we are also a Large Language Model in many ways :) - we get biased and we are lazy. Is not laziness the inspiration for many useful inventions of our times?

As a mother of young kids and as someone who believes in lifelong learning, I made a tiny effort last year to write chapter books for early readers covering the discoveries of the great scientists of our past and present. In my debut release, I featured Sir Isaac Newton’s discoveries in a book titled “Curious Cassie's beach ride quest: A Celebration of Isaac Newton's Discoveries”. I tested the book on a sample of 5 kids all ranging from 6-7 years and observed extreme enthusiasm and an increased number of questions thereafter. Mission accomplished, eh?

I failed this morning! I failed to resist the chatGPT urge to see what it would write instead. So here’s Me vs chatGPT, a debut author vs a stochastic parrot:

ChatGPT vs Me: Show me the script

ChatGPT

Prompt

Write a chapter book for 6 year old kids

ChatGPT's Response
Chapter 1: The Magical Forest

It was a beautiful day in the magical forest. The sun was shining and the birds were singing. The trees were tall and green, and the flowers were blooming.

In the center of the forest was a small cottage, where a young girl named Lily lived with her parents. Lily was six years old, and she loved to explore the forest.

One day, while out on a walk, Lily stumbled upon a strange and magical place. It was a clearing filled with glittering diamonds and sparkling jewels.

"Wow!" exclaimed Lily. "I've never seen anything like this before!"
As she reached out to touch one of the diamonds, a bright light shone and a magical fairy appeared before her.

"Hello, Lily," said the fairy. "I am the guardian of the forest, and I have been waiting for you. You have a very special task to complete."

"Me?" asked Lily. "What can I do?"

"There is a wicked witch who has been causing trouble in the forest," explained the fairy. "She has been turning all of the animals into stone, and we need your help to stop her."

"I'll do it!" said Lily, determined to help the animals.

Chapter 2: The Wicked Witch

Lily set out on her journey to find the wicked witch. She followed the fairy's instructions and travelled deep into the forest.
As she walked, she saw that the trees were no longer green and the flowers had withered away. It was clear that the witch had been causing destruction wherever she went.

Finally, Lily came to a dark and gloomy castle. She knew this was where the witch lived.

She tiptoed through the castle, trying not to make a sound. She could hear the witch cackling in the next room.

Lily peeked through the door and saw the witch stirring a bubbling pot. She had a long, crooked nose and sharp, pointed teeth.

"Ah-ha!" shouted the witch, turning to face Lily. "I've been expecting you, little girl. You won't be able to stop me from turning all of the animals into stone!"

But Lily was not afraid. She remembered the fairy's words and knew she had to be brave.

"I won't let you harm any more animals!" said Lily, standing up to the witch.

The witch cackled and raised her wand, but Lily was quick. She grabbed a nearby broom and swung it at the witch, knocking the witches wand out of her hand.

The witch let out a cry of rage and fled the castle, leaving Lily to save the animals.

Chapter 3: The Great Celebration

With the witch gone, Lily was able to use her magic to turn the animals back to normal. The forest was filled with joy and celebration.

The fairy thanked Lily for her bravery and offered to grant her one wish.

"I wish for the forest to always be a happy and safe place," said Lily.

The fairy nodded and waved her wand, granting Lily's wish. From then on, the magical forest was filled with happiness and peace.
Lily returned home to her parents, who were overjoyed to see her safe and sound. They threw a great celebration in her honour, and the whole forest came to join in the fun.

Lily knew that she would always treasure her adventure in the magical forest, and she hoped to have many more adventures in the future.
Verdict

Very impressive choice of theme, tone and word selection. It did not give me a title, so I asked to name the title and it responded with "The Brave Explorer: A Magical Adventure in the Forest." Very cool!

Prompt

Write a chapter book for 6 year old kids on STEM

ChatGPT's Response
Chapter 1: The Science Fair

It was a beautiful day in the town of Maplewood, and Lily was excited for the annual science fair at her school. She had been working on her project for weeks, and she couldn't wait to show it off.

Lily's project was a model of the solar system, complete with miniature planets made out of clay and a sun made out of papier-mâché. She had worked hard to make sure everything was accurate and to explain the orbits of the planets in a way that other kids could understand.

As the science fair began, Lily set up her project at her table and waited for the judges to come by. She was a little nervous, but she knew she had done her best.

The judges were impressed by Lily's project and asked her lots of questions about the solar system. Lily was happy to answer and explain what she had learned.

In the end, Lily's project won first place in the science fair! She was so proud of herself and couldn't wait to show her parents.

Chapter 2: The Technology Club

After the science fair, Lily's teacher, Mrs. Kim, asked her if she would like to join the Technology Club at school. Lily was excited at the chance to learn more about science and technology, and she eagerly accepted.

In the Technology Club, Lily and her friends learned about all sorts of different topics, from coding and robotics to engineering and renewable energy. They built and programmed robots, designed and tested bridges, and learned about how solar panels work.

Lily loved the Technology Club and couldn't wait to go each week. She learned so much and had a lot of fun doing it.

Chapter 3: The Engineering Competition

One day, Mrs. Kim announced that the Technology Club was going to participate in an engineering competition. The students would have to design and build a Rube Goldberg machine, a complicated contraption that could perform a simple task in a creative and convoluted way.

Lily and her friends were up for the challenge and set to work planning and building their machine. They worked together as a team, using their knowledge of physics, engineering, and problem-solving to make their machine work.

Finally, the day of the competition arrived. Lily and her friends were nervous but excited to see how their machine would perform.

As they presented their machine to the judges, it worked perfectly, performing the simple task with a series of intricate steps. Lily and her friends were thrilled when they were announced as the winners of the competition!

Chapter 4: The Future Scientists

After the engineering competition, Lily and her friends in the Technology Club knew that they wanted to continue exploring science and technology in the future. They were already thinking about new projects and competitions they could participate in.

Lily knew that she wanted to be a scientist when she grew up, and she was grateful for all of the opportunities she had had to learn and explore in the Technology Club.

She knew that no matter what she did in the future, she would always have a passion for science and technology and the desire to learn more.
Verdict

Nice plot setting, nice choice of theme, and context. Nice job. I find it's very peripheral, explaining a lot about the setting but nothing about science. Maybe it’s my prompt. Let's give it another go.

Prompt

Write a 6-7 year old kids’ book on Sir Isaac Newton's discoveries with Cassie as a protagonist

ChatGPT's Response
Cassie and the Apple Tree

Cassie was a curious little girl who loved to learn about the world around her. One day, she was walking through her backyard when she saw an apple fall from a tree.

"Hmm, I wonder why that happened?" Cassie thought to herself. She climbed up the tree to see if she could find out more.

As she reached the top, she saw a man standing under the tree. It was Sir Isaac Newton! Cassie had heard of him before. He was a very smart man who had made many important discoveries.
"Hello, Cassie," Sir Isaac said. "I see you're curious about the apple."

"Yes, I am," Cassie replied. "Why did it fall to the ground?"
"Ah, that's a good question," Sir Isaac said. "I figured out that there is a force called gravity that pulls objects towards the ground. That's why the apple fell."

Cassie was amazed. "Wow, that's so cool!" she exclaimed.
Sir Isaac smiled. "I also made many other discoveries," he told Cassie. "Like the laws of motion, which explain how objects move and change speed. And I invented calculus, which helps us understand how things change over time."

Cassie's head was spinning with all this new information. She couldn't wait to learn more about Sir Isaac's discoveries.

"Thank you for telling me all about this, Sir Isaac," Cassie said. "I can't wait to learn more!"

"You're welcome, Cassie," Sir Isaac replied. "I'm glad you're interested in learning. Remember, the more you learn, the more you'll discover about the world around you."

Cassie smiled and climbed back down the tree, excited to keep learning and discovering new things.
Verdict

Nice choice of words, context is spot on, and tone of delivery is nice too. Some of the phrases are spot on. "Wow, that's so cool!" she exclaimed.” is something I have used too.

Apple and Newton are joined together forever. Cliché, my dear parrot! Cliché! Well done still. Maybe I like more challenges given to the kids, but I still feel it’s too shallow and lacks details. Let’s give it more work and see what ChatGPT does.

Prompt

Write a 6-7 year old kids chapter book on Sir Isaac Newton's discoveries with Cassie as the protagonist with illustrations

ChatGPT's Response
Chapter 1: Cassie and the Apple Tree

[Illustration: Cassie is standing in her backyard, looking up at an apple falling from a tree.]

Cassie was a curious little girl who loved to learn about the world around her. One day, she was walking through her backyard when she saw an apple fall from a tree.

"Hmm, I wonder why that happened?" Cassie thought to herself. She climbed up the tree to see if she could find out more.

[Illustration: Cassie is climbing up the tree, with a determined look on her face.]

As she reached the top, she saw a man standing under the tree. It was Sir Isaac Newton! Cassie had heard of him before. He was a very smart man who had made many important discoveries.

"Hello, Cassie," Sir Isaac said. "I see you're curious about the apple."

"Yes, I am," Cassie replied. "Why did it fall to the ground?"

"Ah, that's a good question," Sir Isaac said. "I figured out that there is a force called gravity that pulls objects towards the ground. That's why the apple fell."

[Illustration: Sir Isaac is pointing to a diagram of the earth and the apple, with an arrow labeled "gravity" pointing from the earth to the apple.]

Cassie was amazed. "Wow, that's so cool!" she exclaimed.

Chapter 2: The Laws of Motion

After learning about gravity, Cassie couldn't wait to find out more about Sir Isaac's discoveries. She asked him to tell her more.

"I also made many other discoveries," Sir Isaac said. "Like the laws of motion, which explain how objects move and change speed."

[Illustration: Sir Isaac is holding a book labeled "Laws of Motion," with diagrams of moving objects in the background.]

Cassie was fascinated. Sir Isaac explained that there are three laws of motion:

    Objects at rest stay at rest, and objects in motion stay in motion with the same speed and direction, unless acted upon by a force.
    The force needed to change the motion of an object is equal to the mass of the object times its acceleration.
    For every action, there is an equal and opposite reaction.

[Illustration: Sir Isaac is pointing to a chalkboard with the three laws of motion written on it, with illustrations of objects in motion on either side.]

"These laws help us understand how objects move and behave in the world around us," Sir Isaac said.

Chapter 3: Calculus

Cassie couldn't believe how much she was learning from Sir Isaac. She asked him if there was anything else he had discovered.

"Yes, I also invented calculus," Sir Isaac said. "It helps us understand how things change over time."

[Illustration: Sir Isaac is holding a book labeled "Calculus," with a diagram of a curve on the cover.]

Cassie was a little confused. "What do you mean?" she asked.

Sir Isaac explained that calculus is a way of studying how things change. "For example, we can use calculus to understand how fast a ball is rolling down a hill, or how fast a plant is growing," he said.
[Illustration: Sir Isaac is pointing to a graph with a curve on it, with a ball rolling down a hill in the background.]
Cassie was amazed. "That's so cool!" she exclaimed.
Chapter 4: The End
After learning all about Sir Isaac's discoveries, Cassie was even more excited to keep learning and discovering new things. She thanked Sir Isaac for all of his help and said goodbye.
"Remember, Cassie," Sir Isaac said as she walked away. "The more you learn, the more you'll discover about the world around you."
[Illustration: Cassie is waving goodbye to Sir Isaac, with a big smile on her face.]
Cassie smiled and waved, excited to keep exploring and learning. She knew that there were so many more mysteries to solve and discoveries to be made.
Verdict

I love this line "The more you learn, the more you'll discover about the world around you." ChatGPT, You can be nice if not exploited! I liked the mention of calculus. It’s a brilliant thought but I stopped at algebra in my book!

However, I remain unsatisfied given the lack of detail and meat around the topic. I feel, for now, human writing is more creative, rich and informative! I am glad that’s the way it is!

Me! Me!

Title: Curious Cassie's beach ride quest: A Celebration of Isaac Newton's Discoveries

[Illustration: Cassie is riding in a car with her family! She is buzzing with interesting curious keywords like Wow, What, How, When, Where! Sir Isaac is featured in the book.]

Here's a sample of my book. It shows 1st chapter of the 3-chapter book.

I am just a debut author, writing out of hobby. I am profoundly impressed by authors who have written brilliant books to teach, guide and shape our children. I hope we will continue to create far-enriched content to shape ourselves and our future with or without AI - just as long as we are in control.

Thanks a lot for reading! If you ever see any value in spreading the word about my book or procuring it for your loved ones then please do so, I would be thrilled!

Disclaimer: This is a personal post. Views shared are my own and do not represent my employers.


Pydra - Pydantic and Hydra for configuration management of model training experiments

How do you manage the configurations of your model training experiments? Having good configuration management in place improves the user experience and simplifies experiment management. Some of the indirect advantages of having a good config solution are clean, reliable, and simple application code. This is true for any software application however some applications demand higher investment on the config piece than others. One key differentiator perhaps here is how many config fields one has and how each of these relates to the other. This can get real messy real quick with deep learning experiments where the list of config and hyperparameters can rapidly grow out of hand as soon as one breaks out of adhoc experimentations.

Context

By definition model training configurations are hierarchial. For example, an experiment configuration can be broken down into data and model specifications. While these are loosely coupled specifications, they may enforce some degree of restrictions on each other e.g. model's output specification needs to be in line with data specification. A classifier network needs to output a 10 length vector if the data source is MNIST and 20 if it's PASCAL VOC.

Overall, configuration solution for training experiments has following requirements:

  1. Ability to easily templatize configurations of various components of your training experiments.
  2. Ability to organize the configuration of training in a hierarchical fashion acknowledging the dependency and hierarchy of participating components.
  3. Without duplication, reuse configurations across modules wherever needed by means such as interpolation, expressions.
  4. Ability to compose config using templates and applying customization through overrides, environments etc.
  5. Being able to store and reuse the final assembled/compiled config and reproduce the experiments reliably.
  6. Support structured config consisting of potentially complex, canonical python types with strong validation support beyond type hint. This perhaps is optional if one is not interested in dynamic domain-specific validations - something that can not be achieved in serialized formats like yaml/json.
  7. Ability to version config and enforce a schema on the config specification for reliability and robustness.
  8. Ability to integrate easily into complex distributed launcher frameworks for both local and remote launch.
  9. A configuration management that nicely integrates into your echo-system of tools and is not very dictative.

With that in mind, I evaluated a bunch of tools combination of Hydra and Pydantic stood out as a good choice. In this post, I will go over my learnings of both the tools and also talk about gaps I still see present. Lastly, put forth a tentative solution for this gap. The structure of this post is as follows:

  1. Hydra - its strengths and also, what I think are its weakness.
  2. Pydantic - how it can be used to fill in the void of Hydra
  3. The missing bits of config management from Pydra i.e. both Hydra & Pydantic
  4. The solution to bridging the gap
  5. Conclusion

Note: The samples and code shared in this post are available here.

Hydra

Hydra is a dynamic hierarchical config management tool with the ability to override parts of the config and also to compose the config through various sources (eg command line args, environment variables). It offers a nice solution to templated configurations that can be repurposed dynamically based on your objectives. Beyond rewriting rules, compositions, and interpolations to re-config templates, it also provides a mechanism to represent the config in plain old python object style and can take care of serialization and deserialization itself. On some noteworthy features side - with hydra you can fire multi-process workload on multiple instances of configurations derived from the same template source. This can be a really handy feature for many use cases including side-by-side comparisons.

It is also extensible in a way that one can write their plugins to launch local or remote workload. Some other nice noteworthy features are runner config and callbacks. All in all, powered by OmegaConf this is a great tool for config management.

A simple non-structured config example

Let's look at a simple use of hydra for yaml based config without involving python object model for config. This example is borrowed and extended from a hydra sample. . this example demonstrates 3 config modules i.e. db, schema and ui assembled togather via high level config.yaml.

Example hierarchical hydra config.

import logging
from dataclasses import dataclass
from typing import List, Optional

import hydra
from hydra.core.config_store import ConfigStore
from hydra.core.hydra_config import HydraConfig
from omegaconf import MISSING, DictConfig, OmegaConf

log = logging.getLogger(__name__)


Hydra configuration framework. URL: [https://hydra.cc/](https://hydra.cc/)..main(config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    log.info("Type of cfg is: %s", type(cfg))
    cfg_dict = OmegaConf.to_container(cfg, throw_on_missing=False, resolve=True)

    log.info("Merged Yaml:\n%s", OmegaConf.to_yaml(cfg))

    log.info(cfg.db)
    log.info(cfg.db.driver)

    log.info(HydraConfig.get().job.name)


if __name__ == "__main__":
    my_app()
Sample code can be found here

A few things happening in this script: 1. How is config loaded: The above script tells hydra to look for hierarchical config under conf as a base location for config directory, and config.yaml as top-level config specification. This setting is defaulted by python decorator Hydra configuration framework. URL: [https://hydra.cc/](https://hydra.cc/)..main(config_path="conf", config_name="config"). A user can override these settings and point the config location to another place by using --config-dir for the base directory and --config-name for top-level config.yaml as follows:

python my_app_non_structured.py --config-dir conf_custom_hydra
python my_app_non_structured.py --config-name config_hydra
2. What format can I access config: Once hydra has loaded the config, it will be loaded as dict/dictionary and surfaced as DictConfig from OmegaConf. The benefit DictConfig is you can access the fields as variable accessors e.g. cfg.db.driver as shown in code above instead of dict str key access pattern. 3. Where is composition coming from: What's defined in conf directory is template config. Some values may be missing - which can be represented in yaml as ???. It's also possible to override some values from the template config at runtime. An example of this is: python my_app_non_structured.py db.user=suneeta Here, the final used configuration that the application will see is:
db:
  driver: mysql
  user: suneeta
  pwd: secret
ui:
  create_db: true
  view: true
schema:
  database: school
  tables:
  - name: students
  - name: exams
4. What is the final config used in my run: Hydra saves the specifics of the config used in the run in its output location. By default the output location is outputs/YYYY-MM-DD/HH-MM-SS/.hydra/config.yaml which can be used to rerun the same run for reproducibility.
python my_app_non_structured.py --config-dir=outputs/2022-03-16/13-45-29/.hydra
Hydra also allows you to configure job and run settings, an example of such configuration is located here. Alternatively, one can also override these settings in the main config.yaml file:

defaults:
  - _self_  # Note: to source default of hydra config 
  - db: mysql
  - ui: full
  - schema: school

hydra:
  job:
    env_set: 
      HDF5_USE_FILE_LOCKING: "false"
      ## Useful for distributed train setup
      # RANK: ${hydra.job.num}
    env_copy: ## Local environment copy
      - API_KEY  

  job_logging:
    version: 1
    formatters:
      simple:
        format: "[%(levelname)s] - %(message)s"
    handlers:
      console:
        class: logging.StreamHandler
        formatter: simple
        stream: ext://sys.stdout
    root:
      handlers: [console]

    disable_existing_loggers: false

  run:
    dir: outputs/${hydra.job.name}/${now:%Y-%m-%d_%H-%M-%S}
5. Multi-run and templates: Multi-run allows users to run the multi-process workload on the same template of config but dynamically changing some values of it. Effectively, the user will be running multiple processes of the same code under different configurations. Parameter sweeps are a really good use case for when this may come in handy. In the following run, multirun is doing the trick triggering 3 different processes each for schema of school, support, and warehouse.
python my_app_non_structured.py db.user=suneeta schema=school,support,warehouse  --config-dir conf_custom_hydra --multirun

Each process gets an int id assigned which the application can recognize. The index starts from 0, as shown in fig below.

Example Of multi-runs.

This is especially useful in distributed training runs where things like RANK of process needs to be known. This can be easily configured using hydra as following:

hydra:
  job:
    env_set: 
      RANK: ${hydra.job.num}

An extension using structured config

If your application configs are slightly more complex and you might want object representation of your config then Hydra provided structured config option with ConfigStore. An example of this is here.

In this case, you define your config object model as follows:

Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class DBConfig:
    user: str = MISSING
    pwd: str = MISSING
    driver: str = MISSING
    timeout: int = 100


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class MySQLConfig(DBConfig):
    driver: str = "mysql"


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class PostGreSQLConfig(DBConfig):
    driver: str = "postgresql"


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class View:
    create_db: bool = False
    view: bool = MISSING


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Table:
    name: str = MISSING


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Schema:
    database: str = "school"
    tables: List[Table] = MISSING


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Config:
    # We can now annotate db as DBConfig which
    # improves both static and dynamic type safety.
    db: DBConfig
    ui: View
    schema: Schema
A few things noteworthy to call out here: 1. MISSING is the replacement of yaml equivalent of missing denoted as ???. 2. Hydra can only work with plain old python objects and at best with dataclasses or drop-in replacements of dataclasses. 3. Structured config can not work with python primitives and cant serde (serialize and deserialize) canonical or complex python objects. Not even things like python pathlib Path.

Once, the config object model is done, one should define ConfigStore and register config objects:

cs = ConfigStore.instance()
cs.store(name="config", node=Config)
cs.store(group="db", name="mysql", node=MySQLConfig)
cs.store(group="db", name="postgresql", node=PostGreSQLConfig)
cs.store(name="ui", node=View)
cs.store(name="schema", node=Schema)

Here, group key in config registry indicates mutually exclusive groups. More about these can be found here.

How does hydra injection change then to support config store? Not by a lot, as shown in this example, changing DictConfig to your high-level config object Config in this case does the trick. Then, OmegaConf.to_object(cfg) unpacks dict config in your python object model.

Hydra configuration framework. URL: [https://hydra.cc/](https://hydra.cc/)..main(config_path="conf", config_name="config")
def my_app(cfg: Config) -> None:    
    log.info("Type of cfg is: %s", type(cfg))
    cfg_dict = OmegaConf.to_container(cfg, throw_on_missing=False, resolve=True)

    cfg_obj = OmegaConf.to_object(cfg)

This is all the application needs to do to move to structured config and features discussed earlier can be explored using the following commands:

python my_app.py
python my_app.py db.user=suneeta    
python my_app.py db.user=suneeta --config-dir conf_custom_hydra
# python my_app.py db.user=suneeta --config-name config_hydra
# Multi-run
python my_app.py db.user=suneeta schema=school,support,warehouse  --config-dir conf_custom_hydra --multirun
Code located here

There is another approach Hydra offers to instantiate objects through the config. This approach requires the user to know which config is object-able beforehand as unpacking them into objects requires explicit calls to instantiate API. An example of such use is as follows, notice fully qualified class name in __target__ field in config:

Example Of instantiate objects.

This feature, in my view, should be used sparingly as it tightly couples the config to application-specific objects. This kind of mixed-breed config handling approach is very hard to maintain and understand.

Some other interesting features of hydra

Interpolation

Interpolation offers a means to refer to an existing config instead of duplicating it. It also allows users to use derive values obtained through string interpolation in config. An example of interpolation is shown below:

Example Of instantiate objects.

Hydra Directives

Hydra allows overriding package configuration employing directives; documentation for these is here. In short, they offer means to rewrite and relocate config dynamically. I think this is evil and facilitates yaml engineering. It can also be a debugging nightmare if your intended use of config is beyond standard interfaces.

Example Of directives to rewrite config specifications.

Talking about directives, __global__ are quite handy in managing configuration overrides. One example is shown here on hydra samples itself

Where Hydra falls short?

Overall, OmegaConf and Hydra are two powerful tools for config management, however, I personally find the following shortcoming in these tools:

  1. No support to serde (serialize and deserialize) canonical or complex python objects can be really annoying and causes more overhead for applications. Only primitive supports are a good start.

    1.1. OmegaConf and Hydra does not support Union type in the config either.

  2. Support for structured config type is limited to plain old Python objects or dataclasses (and drop-in replacement thereof).

  3. Validation framework is a must-have in any config management framework. Anything users provide must be validated in the application for various things like type-checking, application-specific validation, constraint limits, etc. This is currently offloaded largely to application. The choice application is left with is using structured config as simple objects and performing all validation explicitly including obvious things like type checking. This is limiting in my view.

  4. Interpolation is great to have feature. Extension to this is being able to run expression-based interpolations eg list of items to the total item in the list. However, this support is lacking in Hydra with issues being raised.

Pydantic

Pydantic offers a runtime type enforcement and validation and is itself a data validation and settings management tool. It works with pure canonical python data models and provides two approaches for this: 1. Drop in replacement of dataclass 2. BaseModel abstraction

Through its validation framework and type checking pydantic offers schema enforcement and can serialize and deserialize objects from various formats including but not limited to yaml/json, xml, ORM, etc. Its field type support especially constrained types are very rich and vastly simplifies application config validations. Eg can the image size be less than 0? What's the best approach to validate this? limit at config level or sprinkle that in the application where config is used?

Overall, pydantic is a well-documented library, an example of its dataclass use is as shown below:

import logging
import sys
from pathlib import Path
from typing import Optional, Union

from pydantic import BaseModel, conint, validator
from pydantic.dataclasses import dataclass


log = logging.getLogger(__name__)


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Dataset:
    name: str
    path: str

    Python pydantic validator. URL: [https://docs.pydantic.dev/usage/validators/](https://docs.pydantic.dev/usage/validators/).("path")
    def validate_path(cls, path: str) -> Path:
        if Path(path).exists():
            print("exist")
        return Path(path)


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Model:
    type: str
    num_layers: conint(ge=3)
    width: Optional[conint(ge=0)]

    Python pydantic validator. URL: [https://docs.pydantic.dev/usage/validators/](https://docs.pydantic.dev/usage/validators/).("type")
    def validate_supported_type(cls, type: str) -> str:
        if type not in ["alex", "le"]:
            raise ValueError(f"'type' canonly be [alex, le] got: {type}")
        return type


Python dataclass. URL: [https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html).
class Config:
    dataset: Dataset
    model: Model
Notice the use of the validator decorator here. This is an approach to validate for application-specific logics.

BaseModel comes with more bells and whistles as it provides syntactic sugar of Config inner class that can be used to configure the behavior of a concrete implementation of BaseModels. Details of this can be found here. Some of the settings like anystr_lower and use_enum_values are so handy. Who does not want to work with case insensitive configurations but keep the application clear of case conversions? Like BCE and bce means any different than binary cross entropy? BaseModel implementation looks like this:

class Dataset(BaseModel):
    name: str
    path: str

    Python pydantic validator. URL: [https://docs.pydantic.dev/usage/validators/](https://docs.pydantic.dev/usage/validators/).("path")
    def validate_path(cls, path: str) -> Path:
        if Path(path).exists():
            print("exist")
        return Path(path)

    class Config:
        title = "Dataset"
        max_anystr_length = 10
        allow_mutation = False
        validate_assignment = True
        anystr_lower = True
        validate_all = True
        use_enum_values = True 
Full example code is located here

BaseSettings is also another great feature that seeks to abstract away settings/config that is sourced from other sources like an environment variable, init files, etc.

class Secrets(BaseSettings):
    api_key: str = Field("", env="api_key")
    token: str

    class Config:
        # case_sensitive = True
        env_nested_delimiter = "_"

        Python class methods. URL: [https://docs.python.org/3/library/functions.html\#classmethod](https://docs.python.org/3/library/functions.html#classmethod).
        def customise_sources(
            cls,
            init_settings: SettingsSourceCallable,
            env_settings: SettingsSourceCallable,
            file_secret_settings: SettingsSourceCallable,
        ) -> Tuple[SettingsSourceCallable, ...]:
            return env_settings, init_settings, file_secret_settings

An few interesting notes

  1. It's interesting to note that the validator of pydantic can be repurposed to implement derived fields. An example of this is captured here: The use-case being, a list of classes is already provided in config in another module say data. You want to "infer" the total number of classes from this in the model config. In this case, you can interpolate and copy over a list of classes from the data module to model but your requirement of actually only caring about total classes is unfulfilled as you cant do expression-based interpolation just yet. In this case, total_classes can be defined as either list of int and then the total is deduced upon validation, leaving your in config object always representing total int value and logic-based deduction happening upon validation. Example of this is as follows:

from typing import List, Union
from pydantic import BaseModel, conint, validator

class Model(BaseModel):
    total_classes: Union[conint(ge=0), List[conint(ge=0)]]

    Python pydantic validator. URL: [https://docs.pydantic.dev/usage/validators/](https://docs.pydantic.dev/usage/validators/).("total_classes")
    def validate_classes(cls, total_classes: Union[conint(ge=0), List[conint(ge=0)]]) -> conint(ge=0):
        if isinstance(total_classes, List):
            return len(total_classes)
        return total_classes
I have not seen any issue in the use of validator like this but unsure if this is the intended use of validator support. Also, unsure why the validator returns in the said interface.

  1. Hydra's benchmarking is pretty favorable (see image below). However, they dont benchmark against OmegaConf and some other related tools, see details here.

Hydra's benchmarks.

Can Pydantic bridge the gaps in hydra

Pydantic offers a strong validation framework something that's been missing in hydra. While drop-in replacement of pydantic dataclass works with hydra one can still miss out on the syntactical sugars of Model Config and still be limited to primitives and non Union types and more interesting constrained types of pydantic.

It's fair to say that direct integration of hydra with pydantic is absent. The follow-up question should be, can we make it work? Sure we can :).

An example demonstrating how Hydra and Pydantic can be used in conjunction is shown here. The secret sauce of integration lies in the following:

  1. Giving up on the use of ConfigStore of hydra
  2. Reading the config as DictConfig and unpacking to Pydantic BaseModel object via dict

In short, this is two-liner:

    OmegaConf.resolve(cfg)
    r_model = Config(**cfg)
where Config is a concrete implementation of Pydantic's BaseModel. The decision we make it here is configs are specified in yaml format and we leave Hydra to manage template, composition, and all the good things of non-structured config management. Upon resolution and merge of config, we deserialize with Pydantic and get all the goodies of Pydantic validation, check, etc.

How well does this integrate with other ecosystem tools

Given hydra can provide a final configuration in dict format, the configuration side can easily integrate with echo-system tools that care about configs. Given most experiment management and tracking tools support managing config as dict, this is less of a challenge. A post from Weights and biases goes in detail about integration.

It's interesting to note that other features of Hydra like multirun and sweeps given they employ multi-processing may interfere with other tools employing MP techniques too.

Version config

Versioning is very important practice to follow if one works with large, actively evolving code base. For the same reasons as REST APIs should be versioned, docker images are versioned, library releases are versioned, and so on, application config specifications should be versioned as well. If not for incremental updates then at least for breaking changes. Kubernetes actually does this really well, where users can bring their resource specification of supported version and it can work with that version creating ../resources for you. There is lot more complexity in config management that Kubernetes immplements but whats transferable here in my view is:

  1. When creating a run, always use latest version of config.
  2. Have a support in place where you can migrate your old configs to latest version of config so you can reproduce [not repeat] your experiments with newer features should you need to.

This prototype is a toy example to explore how to handle versioning in the config and provide a migration path for when old version config is provided. The assumption made here is, your training code only ever works with the latest version of config so the application code is simple and is always progressive. Whenever a breaking change is introduced, a new float version of config is created while the preceding version of config implements a migration path to the following version.

Code shown below, represents Model config which has 3 versions v i.e. 1.0, 2.0, 3.0. Each version makes a breaking change to the previous but each concrete implementation implements a method def to_next(self) -> Optional[BaseModel] that cracks at migrating the config to a new specification.

class VersionModel(BaseModel, abc.ABC):
    v: Literal["inf"] = Field(float("inf"), const=True)

    class Config:
        title = "Model"
        allow_mutation = False
        max_anystr_length = 10
        validate_assignment = True
        anystr_lower = True
        validate_all = True
        use_enum_values = True

    def version(self) -> float:
        return self.v

    def _next_version_model(self) -> float:
        versions = list(fake_registry.keys())
        versions.sort()
        idx = versions.index(self.v) + 1
        if idx == len(versions):
            return None
        return versions[idx]

    def next_version(self) -> BaseModel:
        next_class = fake_registry.get(self._next_version_model())
        return next_class

    Python abstract base classes. URL: [https://docs.python.org/3/library/abc.html](https://docs.python.org/3/library/abc.html)..abstractmethod
    def to_next(self) -> BaseModel:
        pass


class Model1(VersionModel):
    # https://pydantic-docs.helpmanual.io/usage/types/#arguments-to-constr
    type: str  # constr(to_lower=True)
    num_layers: conint(ge=3)
    width: Optional[conint(ge=0)]
    # v: float = Field(1.0, const=True)
    zoom: conint(gt=18) = 19
    v: Literal[1.0] = Field(1.0, const=True)

    # Python class methods. URL: [https://docs.python.org/3/library/functions.html\#classmethod](https://docs.python.org/3/library/functions.html#classmethod).
    # def to_next(cls, instance: Type[VBT]) -> BaseModel:
    def to_next(self) -> Optional[BaseModel]:
        next_class = self.next_version()
        if next_class != Model2:
            raise Exception("WTH")

        logging.warning("================================================")
        logging.warning("==============Migrating from 1->2 ==============")
        logging.warning("================================================")

        d = self.dict()
        d.pop("v")
        return Model2(**d)


class Model2(Model1):
    width: conint(ge=5)
    context: conint(ge=256) = 256
    # v: float = Field(2.0, const=True)
    v: Literal[2.0] = Field(2.0, const=True)

    def to_next(self) -> Optional[BaseModel]:
        next_class = self.next_version()
        if next_class != Model3:
            raise Exception("WTH")

        logging.warning("================================================")
        logging.warning("==============Migrating from 2->3 ==============")
        logging.warning("================================================")

        return Model3(
            new_context=self.context,
            num_layers=self.num_layers,
            type=self.type,
            width=self.width,
            zoom=self.zoom,
        )

class Model3(Model1):
    new_context: conint(ge=512, le=1300) = 512
    v: Literal[3.0] = Field(3.0, const=True)

    def to_next(self) -> Optional[BaseModel]:
        logging.warning("================================================")
        logging.warning("============== Latest no migration =============")
        logging.warning("================================================")

        return None
Config templates, as shown here are versioned as well.

Let's run through various scenarios: 1. A user wants to run a training and uses the latest config, they run python example.py model=resnet_v3 2. A user wants to run a training and uses the latest config but overrides some values, they run python example.py model=resnet_v3 +model.zoom=25 3. A user wants to re-run a training from an old config, they run python example.py. They use the v2 version of config and migration fails because field context require value is greater than or equal to 512 in v3 but used config in v2 template is 256. The user received feedback on what needs to change in config to make it work. 4. the Same user in #3 wants to re-run a training from the old config, they run python example.py model.context=512. They can run and the application performs the migration and the application actually only Model3 version of object config. 5. I want to expose the schema of Model config, how can I do this:

        _Models = Annotated[Union[Model1, Model2, Model3], Field(discriminator="v")]
        _schema = schema_json_of(_Models, title="Model Schema", indent=2)
This produces the following schema for model config:
{
  "title": "Model Schema",
  "discriminator": {
    "propertyName": "v",
    "mapping": {
      "1.0": "#/definitions/Model1",
      "2.0": "#/definitions/Model2",
      "3.0": "#/definitions/Model3"
    }
  },
  "anyOf": [
    {
      "$ref": "#/definitions/Model1"
    },
    {
      "$ref": "#/definitions/Model2"
    },
    {
      "$ref": "#/definitions/Model3"
    }
  ],
  "definitions": {
    "Model1": {
      "title": "Model",
      "type": "object",
      "properties": {
        "v": {
          "title": "V",
          "const": 1.0,
          "enum": [
            1.0
          ],
          "type": "number"
        },
        "type": {
          "title": "Type",
          "type": "string"
        },
        "num_layers": {
          "title": "Num Layers",
          "minimum": 3,
          "type": "integer"
        },
        "width": {
          "title": "Width",
          "minimum": 0,
          "type": "integer"
        },
        "zoom": {
          "title": "Zoom",
          "default": 19,
          "exclusiveMinimum": 18,
          "type": "integer"
        }
      },
      "required": [
        "type",
        "num_layers"
      ]
    },
    "Model2": {
      "title": "Model",
      "type": "object",
      "properties": {
        "v": {
          "title": "V",
          "const": 2.0,
          "enum": [
            2.0
          ],
          "type": "number"
        },
        "type": {
          "title": "Type",
          "type": "string"
        },
        "num_layers": {
          "title": "Num Layers",
          "minimum": 3,
          "type": "integer"
        },
        "width": {
          "title": "Width",
          "minimum": 5,
          "type": "integer"
        },
        "zoom": {
          "title": "Zoom",
          "default": 19,
          "exclusiveMinimum": 18,
          "type": "integer"
        },
        "context": {
          "title": "Context",
          "default": 256,
          "minimum": 256,
          "type": "integer"
        }
      },
      "required": [
        "type",
        "num_layers",
        "width"
      ]
    },
    "Model3": {
      "title": "Model",
      "type": "object",
      "properties": {
        "v": {
          "title": "V",
          "const": 3.0,
          "enum": [
            3.0
          ],
          "type": "number"
        },
        "type": {
          "title": "Type",
          "type": "string"
        },
        "num_layers": {
          "title": "Num Layers",
          "minimum": 3,
          "type": "integer"
        },
        "width": {
          "title": "Width",
          "minimum": 0,
          "type": "integer"
        },
        "zoom": {
          "title": "Zoom",
          "default": 19,
          "exclusiveMinimum": 18,
          "type": "integer"
        },
        "new_context": {
          "title": "New Context",
          "default": 512,
          "minimum": 512,
          "maximum": 1300,
          "type": "integer"
        }
      },
      "required": [
        "type",
        "num_layers"
      ]
    }
  }
}

Conclusion

It's great to have tools like Hydra and Pydantic to simplify config management and deal with validations efficiently. It would be even great if Hydra integrates nicely with Pydantic as well but at least there is a path to using these two tools together.

Launch of Curious Cassie's Beach Ride Quest

We are thrilled to announce the launch of Curious Cassie's Beach Ride Quest - an exciting concept book from the beloved Curious Cassie series!

Curious Cassie's Beach Ride Quest

Join Cassie, Caleb, and their mom on an extraordinary adventure to the beach. Along the way, they will encounter fascinating situations that explore the wonders of motion and gravity, leaving them inspired by the genius of Sir Isaac Newton.

Title: Curious Cassie's Beach Ride Quest: A Celebration of Isaac Newton's Discoveries

Genre: Children's Fiction (picture books, junior fiction)

Publication Date: 2023-11-12

Curious Cassie's Beach Ride Quest

You might also enjoy reading ChatGPT vs Me: As a Children's Author

Get your copy now:

Kindle | Paperback

Read Online: FableFlow Interactive

Stay Connected!

Connect with Cassie: | FableFlow