RE: Fragile state - a development story (technical).

Thanks for the write-up. It is very interesting to find out what core development looks like and what challenges it involves. A very tricky one indeed, this was.

Do you think modularity can help reduce this fragility?

You'd need to expand on what you mean by modularity.

There are many things left to do that could help prevent mistakes and/or make issues easier to detect and diagnose.

  • Reduction of data held in state helps. But most of what we could push out has already been moved to Hivemind/HAF. I believe only minor stuff like market history or account metadata still needs to be moved out of hived code.
  • There is still a lot of, let's say, technical debt. Some code can be modified for better clarity and to make it less error-prone, so that people who are not fully familiar with it have an easier time understanding it. For example, in most of the code we know what kind of asset is being manipulated. In such cases the general-purpose asset can be replaced with a strongly typed tiny_asset<> (e.g. HIVE_asset), which makes the compiler protect you from accidentally mixing assets. Such changes are usually made on the fly, when the particular code is being modified anyway.
  • Regression tests. These are being created every day, and we are also developing more testing tools.
  • There is also some stuff in the works that could fit the "modularity" term. But it is being done to make certain features possible, not to make the code harder to break. For example, one guy is working on separating block_log from the code that holds and manipulates state. Once the interface is defined, we will add multiple implementations to choose from, for example one that makes pruning possible. And the other half of the separation will be usable as a "light node" embedded in HAF apps when needed ("state provider": if you hear this term, it is that feature).
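To illustrate the strongly typed asset point from the list above, here is a minimal sketch of the idea. It is not the actual hived code: `typed_asset`, the `symbol` enum, `can_add` and `total_balance` are hypothetical names made up for this example, and only the alias name `HIVE_asset` is borrowed from the text. It assumes C++17.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical symbol tags, for illustration only.
enum class symbol : std::uint32_t { HIVE, HBD, VESTS };

// Simplified stand-in for a strongly typed asset: the symbol is part of
// the type, so only the amount is stored at run time.
template <symbol S>
struct typed_asset {
    std::int64_t amount = 0;

    // explicit: a bare integer cannot silently become an asset.
    constexpr explicit typed_asset(std::int64_t a = 0) : amount(a) {}

    // Arithmetic is defined only between assets of the same symbol.
    constexpr typed_asset operator+(typed_asset o) const { return typed_asset(amount + o.amount); }
    constexpr typed_asset operator-(typed_asset o) const { return typed_asset(amount - o.amount); }
    constexpr bool operator==(typed_asset o) const { return amount == o.amount; }
};

using HIVE_asset = typed_asset<symbol::HIVE>;
using HBD_asset  = typed_asset<symbol::HBD>;

// Compile-time probe: does the expression `A + B` compile?
template <typename A, typename B, typename = void>
struct can_add : std::false_type {};

template <typename A, typename B>
struct can_add<A, B, std::void_t<decltype(std::declval<A>() + std::declval<B>())>>
    : std::true_type {};

// Same-symbol addition is allowed; mixing symbols is rejected by the compiler.
static_assert(can_add<HIVE_asset, HIVE_asset>::value, "HIVE + HIVE should compile");
static_assert(!can_add<HIVE_asset, HBD_asset>::value, "HIVE + HBD should not compile");

// Hypothetical helper: the signature itself documents which asset is expected.
constexpr HIVE_asset total_balance(HIVE_asset liquid, HIVE_asset savings) {
    return liquid + savings;
}
```

With a general-purpose asset, mixing symbols can only be caught by a run-time check, if at all. Here, passing an `HBD_asset` to `total_balance` simply fails to build.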

Overall I mean the separation of concerns/functionalities into their own compartments, so that they do their thing internally (in an ideal case, of course). As you know, this vastly reduces the interactions between modules that can lead to all sorts of unpredicted behaviors. As far as I understood, you were describing debugging a situation where one concern (RC costs) was being influenced by, and was influencing, other concerns in various subtle loops. The logic of such interactions gets very complicated and can be extremely hard to debug, and it obviously leads to fragility, where you make a change somewhere and completely unexpected changes pop up somewhere else.

Would it make a difference if the RC code did its thing internally and returned its results to the main program? I don't know how feasible that would be. But simple, straightforward logic seems far better than subtle and complicated logic. Obviously, refactoring involves a kind of reconceptualization, where numerous little things are organized into robust concepts with a clear flow of information.

No, no, this one is an "all stays within the family" type of case. The RC code was moved from one place to another and cleaned up. The key point is that if you look at the changes, it "is obvious" they should not influence anything, yet they unexpectedly do. It shows that changing blockchain code is a bit like walking through a minefield. It is also an introduction to the power of the new RC stamps mechanism, which right after its inception caught a subtle influence that would otherwise most likely never have been detected.

I have a similar story about an ongoing bug with power down (its fix will activate with HF28), and another about a problem, already fixed in v1.27.5 but with a wild workaround, of strange values in some accounts' properties, detected during testing of the Balance Tracker app. I might describe them in future articles, since they could be a useful case study for people wanting to try their skills on consensus code.