I have a few questions about this data:
Is the 'steemit' system account included? Both including it and excluding it are relevant in terms of understanding wealth distribution and changes to wealth distribution. Either way it is a huge factor so please specify.
Does the groupwise power-down data net out power-ups? If one whale account is powering down and another (with either the same or a different owner; more on this later) is powering up, that does not indicate any kind of distribution across the wealth spectrum (or of any other kind, in the case of a common owner).
A clear caveat with any analysis of this form is that accounts are not the same as people. There is a very serious streetlight effect here, and its importance can't be overstated. I have >1000 accounts. How much stake do I have in the accounts other than the obvious ones (@smooth and @smooth.witness)? No one really knows the answer to that, and I suspect that for other whales (and even non-whales, or more importantly non-identified whales) the hidden share is even more significant, perhaps by a very large degree. Conversely, accounts may have shared ownership, so a single large account may not be evidence of concentrated ownership at all (we are seeing more of this now with "group" accounts like @robinhoodwhale being created). This degree of opacity about the distribution of wealth ownership may be frustrating to data junkies and community members alike, but it is reality, and disliking that fact should not be taken as a reason to disregard it.
1.) Yes, the steemit account is included. I don't know of any reason not to include it.
2.) This is a very good idea. I was only interested in the total power-up vs power-down delta over a period of time; however, your question calls for a more useful metric with regard to wealth redistribution. Furthermore, it might be interesting to extract just the redistributive power-ups, where a whale powers up a non-whale account.
3.) As you've pointed out, the ownership of an account is hard to prove. Perhaps a deeper dive into the data would help identify some whale sub-accounts, with a limited degree of confidence, based on shared behavioral patterns. For example, if a whale created accounts with the intent to bot-vote content, the activity history on all of those accounts will share a common pattern. If the whale funded those accounts directly from a known whale account, or from a known proxy account, we could make an ownership assumption.
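The metrics from point 2.) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `(kind, source, target, amount)` record layout, the `WHALE_THRESHOLD` cutoff, and the function name `net_power_flows` are all hypothetical, and amounts are assumed to have already been converted to a common unit. The account names and numbers in the example are made up.

```python
from collections import defaultdict

# Hypothetical cutoff: accounts at or above this balance count as whales.
WHALE_THRESHOLD = 1_000_000


def net_power_flows(events, balances, threshold=WHALE_THRESHOLD):
    """Net power-ups against power-downs per group.

    events: iterable of (kind, source, target, amount) tuples (assumed layout).
        kind is "power_up" or "power_down". For a power-up, `source` pays
        and `target` receives the stake. For a power-down, source == target.
    balances: dict mapping account name to its balance, used for grouping.

    Returns (net_by_group, redistributive), where `redistributive` is the
    total powered up by a whale into a non-whale account.
    """
    def group(account):
        return "whale" if balances.get(account, 0) >= threshold else "non_whale"

    net = defaultdict(float)
    redistributive = 0.0
    for kind, source, target, amount in events:
        if kind == "power_up":
            net[group(target)] += amount
            if group(source) == "whale" and group(target) == "non_whale":
                redistributive += amount
        elif kind == "power_down":
            net[group(source)] -= amount
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return dict(net), redistributive


# Made-up example data, not real chain data.
balances = {"whale_a": 5_000_000, "whale_b": 2_000_000, "minnow": 500}
events = [
    ("power_down", "whale_a", "whale_a", 1000.0),  # whale powers down
    ("power_up", "whale_b", "whale_b", 800.0),     # another whale powers up
    ("power_up", "whale_a", "minnow", 150.0),      # whale powers up a non-whale
]
net, redistributive = net_power_flows(events, balances)
print(net, redistributive)
```

In this toy data the whale group's 1000 power-down is mostly offset by the 800 power-up of another whale, so the group's net change is only -200. Grouping by account balance still says nothing about who owns the accounts, so the caveat from point 3.) applies to the result.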
It is certainly an outlier. There aren't any other accounts that have a significant portion of their holdings specifically earmarked to be given away to new users. Another significant portion is earmarked to be sold to provide funding for development. Both require powering down, meaning that any powering down by that account is not a discretionary decision being made on the basis of desire to reduce investment (or transfer stake between accounts). Power downs for the purpose of creating new user accounts do not lead to selling at all (though this would also apply to other power downs intended to power up another account). It isn't owned by a person, it is owned by a company, and we don't know how many owners the company has. Finally, that account also does not vote, so it has no bearing on the distribution of content or curation rewards.
Those are some of the reasons to not include it. On the other side, some of its power downs do hit the market, so it is meaningful to include it in that sense.
It doesn't make sense to me to exclude the largest owner from the ownership table, or the power-down metrics.
Giving a tiny fraction of coins away and paying for development are both good causes, but not good enough to pretend that the money doesn't exist.
The budget for coins being given away is not 'a tiny fraction', it is half of all the coins in that account. I'm not suggesting that the money doesn't exist, although that is one perspective too. Since the account is part of the platform design, it isn't entirely unlike the undistributed coins (currently about 5 million) in a system like Bitcoin. If Bitcoin had an "undistributed coins" account on the blockchain would you include it? It would be the largest known holder of Bitcoins.
Anyway, my primary purpose in raising the issue was not to argue whether or how it should be included, but to ask that the choice be clearly stated. For example, in the statistics on steemd.com/distribution, the steemit account is not included, but that was not specified until I asked that a clarification be added. Now that it is clear, those statistics can be interpreted accordingly. Likewise with yours.
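To show why stating the choice matters, here is a minimal sketch that reports a concentration figure both ways. The helper `top_share` and the balances are hypothetical and chosen only for illustration; they are not real chain data.

```python
def top_share(balances, top_n=1, exclude=()):
    """Fraction of total stake held by the top_n accounts.

    balances: dict mapping account name to balance.
    exclude: account names to leave out of both the ranking and the total.
    """
    kept = [bal for name, bal in balances.items() if name not in exclude]
    total = sum(kept)
    if total == 0:
        return 0.0
    return sum(sorted(kept, reverse=True)[:top_n]) / total


# Made-up balances for illustration only.
balances = {"steemit": 60.0, "a": 25.0, "b": 10.0, "c": 5.0}

with_system = top_share(balances)
without_system = top_share(balances, exclude=("steemit",))
print(f"top holder share, steemit included: {with_system:.3f}")
print(f"top holder share, steemit excluded: {without_system:.3f}")
```

The same data gives a different top-holder share depending on whether the system account is in the denominator, so a table that does not say which convention it uses can't be interpreted reliably.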
To me it would seem the equivalent of adding @null to similar SBD analysis.