The case against blockchain identity


There is a movement afoot to disrupt the identity landscape, and many startups are looking to blockchain to bridge the gap.

But why?

A blockchain is an append-only data store with decentralized (and incentivised) consensus, providing strong immutability and transaction order guarantees — and this is perfect for solving the real problems of double-spending and censorship in the cryptocurrency world.

Some startups are using the blockchain to effectively replace DNSSEC+SSL. Even if this approach helped users initiate confidential and authenticated communications with businesses (it doesn’t; more on this later), no blockchain can provide the reverse assurance: that a person is who they claim to be. That requires centralized, 3rd party verifiers and/or digital signatures.

Some startups are using the blockchain to provide an immutable audit trail. But why, when ordinary hashing can provide evidence of tampering? If a verifier has signed a hash that matches the profile a user has provided to you, is that not enough? And if the audit trail is actually stored outside the blockchain using content-addressable storage (this is pretty much required), then it’s not immutable! It’s just tamper evident!
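
As a rough illustration of the tamper-evidence point, here is a minimal Python sketch. The profile fields and the `profile_digest` helper are hypothetical, and the verifier's signature over the digest is assumed to exist and is not shown.

```python
import hashlib
import json


def profile_digest(profile: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the profile."""
    # sort_keys and fixed separators make the encoding deterministic,
    # so equal profiles always hash to the same value.
    canonical = json.dumps(profile, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical profile; the field names are illustrative only.
profile = {"name": "Alice Example", "dob": "1990-01-01"}

# The digest the verifier attested to (assumed to arrive inside a signed message).
attested_digest = profile_digest(profile)

# The consumer recomputes the digest over whatever the user hands over.
tampered = dict(profile, dob="1980-01-01")
print(profile_digest(profile) == attested_digest)   # True
print(profile_digest(tampered) == attested_digest)  # False
```

Any change to the profile changes the digest, so the consumer detects tampering without a ledger. What the hash cannot do is stop the change from happening, which is the difference between tamper evident and immutable.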

Some startups are using the blockchain to store cryptographic attestations from trusted verifiers — but why, when this can be transmitted by any available means? And if timing is important, can’t a verifier just include a timestamp in their signature?
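
A sketch of a timestamped attestation, under one loud assumption: the Python standard library has no asymmetric signatures, so HMAC with a hypothetical shared key stands in for the verifier's digital signature. A real system would use something like Ed25519 so that consumers never hold the signing key.

```python
import hashlib
import hmac
import json

# ASSUMPTION: HMAC with a shared secret stands in for a real digital signature.
# The key and the payload field names are hypothetical.
VERIFIER_KEY = b"hypothetical-verifier-key"


def attest(profile_hash: str, issued_at: int) -> dict:
    """Build an attestation whose tag covers both the profile hash and the timestamp."""
    payload = json.dumps(
        {"profile_hash": profile_hash, "issued_at": issued_at}, sort_keys=True
    )
    tag = hmac.new(VERIFIER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def check(attestation: dict) -> bool:
    """Recompute the tag; any change to the payload, timestamp included, fails."""
    expected = hmac.new(
        VERIFIER_KEY, attestation["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, attestation["tag"])


att = attest(hashlib.sha256(b"profile bytes").hexdigest(), issued_at=1500000000)

# Someone tries to backdate the attestation without access to the key.
backdated = dict(att, payload=att["payload"].replace("1500000000", "1400000000"))
print(check(att))        # True
print(check(backdated))  # False
```

Because the timestamp sits inside the signed payload, it can travel over any channel and still be checked. How far you trust the time itself depends on how far you trust the verifier, which is already the trust assumption in this model.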

Some startups are hashing identities and storing the hash on a blockchain — but why, when the hash by definition already has an immutable relationship with the data it was created from? Can’t they just use IPFS or Swarm? Do they need strong transaction order guarantees?
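
To make the content-addressing point concrete, here is a toy in-memory store. It is only a sketch of the idea behind systems like IPFS, not their API; the `ContentStore` class is hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the SHA-256 of the value."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read: a swapped or corrupted blob no longer matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"encrypted identity profile")
print(store.get(address) == b"encrypted identity profile")  # True
```

The binding between the address and the data comes from the hash function alone. Nothing in it depends on transaction ordering or consensus.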

What is the special sauce that blockchain affords to identity providers that warrants the use of this specific technology instead of boring crypto and decentralized storage?

Let’s find out.


1. KNOW YOUR CUSTOMER

The only way to truly KNOW who you have a relationship with is to obtain both proof of identity and an attestation from a trusted verifier.

The future of identity is fully digital, with cryptographically signed digital identities being issued by governments and other organisations. And notably, the first government to implement a digital identity did not use a blockchain: they used boring, centrally authorised digital certificates. Don’t be surprised if the rest of the world follows exactly in their footsteps (and don’t be surprised if it takes decades).

Today’s businesses are looking for ways to streamline and secure their identity verification processes within today’s infrastructure of scanned government IDs and 3rd party verification services.

The primary concerns of today’s businesses conducting identity checks are:

  1. How can I reduce data entry at the point of sale, so that I can increase conversions?
  2. How can I improve users’ trust in my platform?
  3. How can I reduce the likelihood and impact of a data breach, and where possible, my exposure to sensitive data in the first place?

These are the market opportunities — helping governments implement tomorrow’s infrastructure of centralized, blockchain-less digital identities, and helping businesses streamline and secure their identity verification processes within today’s infrastructure.

Roles in the Identity ecosystem

The identity ecosystem is broken up into the following roles:

  • Identity Issuer
  • Identity Owner
  • Identity Provider
  • Identity Verifier
  • Identity Consumer

These roles are not mutually exclusive — for example, the Estonian government acts as both issuer and verifier (due to the use of digital signatures), which also dissolves the need for an Identity Provider.

Within today’s infrastructure, an Identity Provider is typically tasked with collecting identity information (including scanned government IDs) from users (Identity Owners) and mediating the authorisation/sharing of this information with businesses (Identity Consumers), and optionally 3rd party verification services.

Thus, an Identity Provider is also responsible for making the entire process as secure as possible for Identity Owners and Identity Consumers. While Owners and Consumers will always have to implement best practices to protect themselves and their systems, the more work they can outsource to the Identity Provider the better.


2. THREAT MODEL

Here’s a (best-effort) list of everything that could possibly go wrong for an identity provider, owner or consumer, using STRIDE terminology:

Spoofing

  1. An attacker compromises a user’s credentials (mitigated by using strong 2FA)
  2. An attacker compromises a user’s device (mitigated by challenging for a password or PIN when appropriate)
  3. An attacker creates a false account with a previously stolen identity (no reliable way of identifying a malicious user in this context, and duplicate checking returns false positives for legitimate use cases — more on this in the next section)
  4. An attacker impersonates an Identity Consumer to extract or intercept a user’s profile (mitigated by using a trust network, e.g. DNSSEC+TLS)
  5. An attacker impersonates an Identity Provider to deliver a backdoored application (mitigated by using digital signatures)
  6. An attacker impersonates a Certification Authority to subsequently impersonate an Identity Consumer or Provider (a problem without a perfect mitigation — more on this in the next section)

Tampering

  1. An attacker issues false updates to a user’s profile (mitigated using digital signatures)
  2. An attacker censors a past update to a user’s profile (mitigated using Merkle trees)
  3. An attacker censors all updates to a user’s profile after a given point (mitigated by using decentralized storage and providing the current Merkle root to an Identity Consumer)
  4. An attacker forces an Identity Provider (silently or coercively) to deliver a backdoored application (very difficult to prevent, efforts should be focussed on detection and/or impact reduction)
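
The Merkle tree mitigation for a censored past update can be sketched as follows. This is a toy construction: it pairs an odd node with itself and omits the leaf/node domain separation that a production design should have.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Compute a Merkle root; an odd node at any level is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical update history for one profile.
updates = [b"update-1", b"update-2", b"update-3"]
root = merkle_root(updates)

# A censor drops the middle update; the recomputed root no longer matches.
censored = [updates[0], updates[2]]
print(merkle_root(censored) == root)  # False
```

A consumer who holds the current root can tell that a history with a missing entry is not the history the root commits to.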

Repudiation

  1. An attacker provides different sets of information to two parties that are relying on having the same version of that information e.g. a verifier and a business (mitigated by using plaintext authentication)

Information Disclosure

  1. An attacker gains access to a user’s identity profile, potentially including identity documentation, via the user’s device or the Identity Provider application (mitigated by limited access: controlling what information can be retrieved after being provided by the user)
  2. An attacker gains access to many identity profiles by compromising the identity provider (mitigated by limited knowledge: the use of end-to-end encryption)
  3. An attacker gains access to many identity profiles by compromising an identity consumer (mitigated by limited access and/or knowledge: expiring authorisation and reducing what information is provided in plaintext to the business in the first place)
  4. An attacker gains knowledge about which third parties a user is sharing their information with to direct an attack (mitigated by limited knowledge: metadata encryption, hashing and/or tokenization)
  5. An attacker gains knowledge about what type of information a user is sharing on their profile to support or direct an attack (mitigated by limited knowledge: the use of metadata encryption and/or hashing)
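
One caveat on the "hashing and/or tokenization" mitigations: metadata such as a consumer's domain name is easy to guess, so a plain hash can be confirmed by anyone who tries the likely values. A keyed token avoids that. The sketch below assumes a hypothetical per-user secret that never leaves the user's control.

```python
import hashlib
import hmac


def consumer_token(user_secret: bytes, consumer_id: str) -> str:
    """Derive an opaque, per-user token for a consumer identifier."""
    return hmac.new(user_secret, consumer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def plain_hash(consumer_id: str) -> str:
    return hashlib.sha256(consumer_id.encode("utf-8")).hexdigest()


# An observer can confirm a guess against a plain hash without any secret...
guess = "mybank.com"
print(plain_hash(guess) == plain_hash("mybank.com"))  # True

# ...but cannot reproduce a keyed token without the user's secret,
# and tokens for the same consumer differ between users.
alice = consumer_token(b"alice-secret", "mybank.com")
bob = consumer_token(b"bob-secret", "mybank.com")
print(alice == bob)  # False
```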

Denial of Service

  1. An attacker prevents a user from using their identity profile by deleting it entirely (mitigated by using decentralized storage)
  2. An attacker prevents all users from using their identities by attacking centralised infrastructure (can only be prevented using decentralized or distributed architecture, read more below)
  3. An attacker prevents a user from using their identity by coercing an identity provider to block their access (mitigated by limited knowledge: protecting user’s identities from the identity provider — more on this in section 4)
  4. An attacker prevents a user from using their identity profile by flooding an append-only data structure with junk data (mitigated using digital signatures)

Elevation of Privilege

  1. And to complete the acronym: an attacker uses a vulnerability, bug or design flaw to gain elevated access to an identity provider, consumer or user device (mitigated by a long list of best practices)

3. THREAT ANALYSIS

Let’s address the threats that are — potentially — best mitigated by strong decentralized consensus (blockchain).

Censoring all updates to a user’s profile after a given point

Because of the expense of blockchain storage, a person’s full identity profile (and especially identity documents) cannot be stored there — only a hash.

A would-be-censor’s real target is the identity storage, so blockchain does nothing to mitigate against this.

The best mitigation for this attack is storing the profile on the user’s device, and/or storing a Merkle root or hash pointing to encrypted, decentralized, content-addressable storage (e.g. IPFS/IPLD or Swarm).
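
Here is one way this could look, as a sketch only. Each update record links to the previous head, the user keeps the latest head hash, and a plain dict stands in for the content-addressable store. The record layout and function names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Optional


def append_update(store: Dict[str, bytes], prev_head: Optional[str], change: dict) -> str:
    """Store a record that links to the previous head and return the new head hash."""
    record = json.dumps({"prev": prev_head, "change": change}, sort_keys=True).encode("utf-8")
    head = hashlib.sha256(record).hexdigest()
    store[head] = record
    return head


def read_history(store: Dict[str, bytes], head: str) -> List[dict]:
    """Walk back from the head, verifying every record against its address."""
    history = []
    cursor = head
    while cursor is not None:
        record = store.get(cursor)
        if record is None or hashlib.sha256(record).hexdigest() != cursor:
            raise LookupError("missing or altered record: " + cursor)
        entry = json.loads(record)
        history.append(entry["change"])
        cursor = entry["prev"]
    return list(reversed(history))


store = {}
h1 = append_update(store, None, {"email": "a@example.com"})
h2 = append_update(store, h1, {"email": "b@example.com"})
print(read_history(store, h2))  # both changes, oldest first
```

If storage withholds everything after `h1`, a consumer holding `h2` gets a `LookupError` rather than a silently stale profile. The censorship is detected using only a hash held by the user.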

Launching a DoS attack on centralized architecture

While decentralization (or at least distribution) is the best way to prevent this kind of attack, blockchain is potentially the worst decentralization strategy: not only are blockchains NOT immune to DoS attacks, but anything built on top of them WILL be collateral damage for highly incentivised DoS attackers invested in shorts, competing technology and/or censorship.

Building decentralized smart contracts and apps that interact directly with cryptocurrency, on-chain, is a perfect example of when this trade-off is absolutely worth it.

But don’t believe the FUD. Blockchain is not ready to replace DNS (see below), and the use of blockchain as a general DoS mitigation doesn’t hold water when better services exist: decentralized storage protocols like IPFS or Swarm, P2P protocols like BitTorrent or libp2p, and MASSIVELY distributed private services like CloudFlare.

Attacking a Certification Authority or DNS Provider

This happens in the real world: centralized certification authorities have been compromised and used to issue fraudulent SSL certificates (DigiNotar in 2011 is the best-known case), and DNS providers are frequently the target of DoS attacks.

But replacing DNSSEC+SSL with a blockchain-driven decentralized name service (one that also broadcasts cryptographic identities) has unresolved problems: if MyBank registers mybank.com, and then an attacker registers my-bank.com, how is a user supposed to know which site is valid?

This is a problem jointly mitigated by domain registrars and certificate authorities.

Under the current system, once the fraudulent my-bank.com domain is detected, DNS queries will be blocked by the registrar and SSL certificates will be revoked via OCSP and revocation lists. How does a self-sovereign name-to-public-key service protect against such an attack? How does such a decentralized name service converge on the “correct” domain name for MyBank, without a central authority?

Automated “similarity scoring” and dispute resolution don’t solve this. They only help up until the point that an attacker finds a way to work within the algorithm’s constraints. This is a problem currently affecting automated CAs (Let’s Encrypt and Comodo), and it is further exacerbated in a decentralized system where the algorithm is open source.
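
To see why a published similarity rule is easy to sidestep, consider a hypothetical check that rejects any name within an edit distance of 2 of a protected name. The threshold and the `looks_like` helper are assumptions for illustration, not any registrar's or CA's actual policy.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def looks_like(candidate: str, protected: str, max_distance: int = 2) -> bool:
    """Flag names that are close to, but not identical to, a protected name."""
    return candidate != protected and edit_distance(candidate, protected) <= max_distance


print(looks_like("my-bank.com", "mybank.com"))              # True: distance 1, rejected
print(looks_like("mybank-secure-login.com", "mybank.com"))  # False: distance 13, accepted
```

The second name is every bit as usable for phishing, yet it passes because the attacker can read the rule and register something just outside it.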

A web-of-trust could be employed, but this has plenty of its own problems and could arguably be implemented without using a decentralized name service for certificate chain discovery.

The only reliable way to protect against this kind of attack is to use a trusted registrar. The trust cannot be eliminated from this problem, only moved.

And if you absolutely must provide a decentralized name service, give it to your customers as an option. Support ENS and/or Namecoin alongside regular DNSSEC+SSL, and don’t force a service on your customers that doesn’t provide any protection against the registration of phishing domains, especially without clearly communicating the risks.

And don’t even think about making yourself a certification authority! Sign authorised names and endpoints, for sure, but leave it to existing (and better designed) services to establish cryptographic identities.

Impersonating a User

Relying on a decentralized name service to identify users has even more problems.

A stolen identity is easy to cryptographically report to a decentralized name service — just submit a signed message saying STOLEN. But how can anyone trust your new identity? You can’t just include the new identity in your signed report, because the thief can do exactly the same! Your only option is to broadcast your new identity using — wait for it — trusted, centralized, social networks.
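
The dilemma can be shown in a few lines. As a loudly stated assumption, HMAC with the identity's secret stands in for a signature made with the identity's private key; the point is only that whoever holds the key material can produce a valid report. All names below are hypothetical.

```python
import hashlib
import hmac

# ASSUMPTION: HMAC stands in for a private-key signature. After the theft,
# both the owner and the thief hold this secret.
identity_secret = b"hypothetical-compromised-secret"


def signed_report(secret: bytes, new_identity: str) -> tuple:
    """Produce a STOLEN report naming a replacement identity."""
    message = ("STOLEN; new identity: " + new_identity).encode("utf-8")
    return message, hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid(secret: bytes, message: bytes, tag: str) -> bool:
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


owner_report = signed_report(identity_secret, "owner-new-key")
thief_report = signed_report(identity_secret, "thief-new-key")

print(is_valid(identity_secret, *owner_report))  # True
print(is_valid(identity_secret, *thief_report))  # True
```

Both reports verify, and nothing cryptographic distinguishes the legitimate one. Burning the old identity works, but nominating its successor needs some out-of-band source of trust.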

But at least you have the option of cryptographically burning your old identity — how can a user report a lost identity to a decentralized service?

Users are notoriously good at losing their private keys/passwords/devices. How does such a system tell the difference between someone who has had to create a new profile, and someone who is impersonating someone else using identity information they intercepted elsewhere?

You could implement a web of trust on top of the service, but like certification authorities this can be subverted and the mitigations are even more complicated (see the previous section).

Once an identity is stolen, very little can prevent an attacker from fooling the world, and that’s why identity privacy is so important. No amount of crypto can help a system discern between a hopeless user and an impersonated user, and decentralization makes the situation WORSE because lost/stolen identities require the creation of new identities, and the duplicates stick around forever.

The world’s first digital identity, the Estonian e-Residency program, only works because the government is the certification authority and can immediately revoke (and replace) an identity certificate when it is reported as lost or stolen.


4. IDENTITY VS CRYPTOCURRENCY

Now that we’ve explored a threat model and failed to find anything compelling, let’s try to forcibly shoehorn the world of identity into the problems of cryptocurrency — double-spends and censorship.

Double Spends

An identity provider is not responsible for the storage of value: in the world of identity, there are no assets. Everything is a digital copy. The benefit is the ease with which this information can be securely transmitted from identity owners to identity consumers.

Because there is no stored value, there are no “double-spends”, and therefore no need for the strong transaction order guarantees of blockchain.

Censorship

Unlike stolen cryptocurrency coins, an identity profile is trivial to recreate. Annoying for sure, and terribly inconvenient, but far less damaging than trying to run a business that cannot receive funds.

Good security controls that protect the privacy of an Identity Provider’s users from the Identity Provider will have a secondary effect of making it very difficult for a censor to prevent a user from recreating their profile.

This means that the impact of a censorship attack has a fixed (and low) ceiling per user, unlike payments where the impact changes relative to the value of a transaction, necessitating strong decentralized consensus: a blockchain.

There are plenty of real-world examples of censorship in the traditional payments ecosystem that blockchain definitely solves (look at WikiLeaks), but the real problem with identity is privacy, something that a blockchain does not provide. And this is the real pain that businesses will pay money to alleviate.


5. IMPLEMENTING BLOCKCHAIN FOR IDENTITY

If you choose to ignore everything above and build a blockchain identity business anyway, you have the following options:

  1. Build (fork?) your own public blockchain
  2. Run your own private blockchain
  3. Leverage an existing public blockchain

Option 1 is pretty much the most ambitious (and perilous) combination of software engineering and marketing anyone can undertake today, whether you are forking or not.

Option 2 massively compromises strong decentralized consensus; places significant trust in a small number of node operators; doesn’t give you more redundancy than existing point-and-click cloud database replication; potentially reduces the privacy of your system (if other controls aren’t used) and increases your system’s attack surface area. Private chains are great for building trust between peers naturally incentivised to collude against each other, but this doesn’t pass any cost/benefit test for identity.

And if you are using an existing public blockchain to “secure” your identity product (the only sane option of the three), you are introducing the following complications:

  • You (and your customers and users) are chained to an update cycle that isn’t in your control and is subject to fierce politics
  • You (and your customers and users) may be impacted by protocol updates that are detrimental to the functional operation of your system
  • You (and your customers and users) WILL be affected by DoS attacks that aren’t specifically targeted at you
  • A proportion of your customers and users WILL be dissuaded from using your service due to political issues affecting your blockchain of choice

And whichever option you choose, you will have to contend with the following:

  • All users must run a standalone or bundled blockchain client (a complicated piece of software) or delegate trust to a 3rd party (weakening the blockchain promise)
  • High blocktime (more than 2–3 seconds) means that the blockchain cannot be used to transmit or validate new or newly updated profiles at the point of sale (100 percent of your users on day one)
  • Low blocktime (2–3 seconds) means you are gambling on a non-POW consensus protocol (and given the enormous sacrifices Vitalik Buterin and Vlad Zamfir have made to solve this problem, you should be worried), or you’re using a private chain with zero blocktime and don’t care about strong decentralized consensus at all (a pointless exercise)
  • Any blocktime is an average: there are many scenarios in which the blocktime can skyrocket — and you’ll be left trying to explain to your customers why your service is suddenly ten times slower.
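
For a rough sense of how far individual waits can stray from the average, here is a back-of-the-envelope sketch. It assumes block intervals are exponentially distributed around the mean, which is the usual simplified model for proof-of-work mining at a steady hash rate. This is a best case: it ignores the congestion and hash-rate swings that make real blocktimes skyrocket.

```python
import math


def prob_interval_exceeds(threshold_s: float, mean_s: float) -> float:
    """P(interval > threshold) for an exponential interval with the given mean."""
    if mean_s <= 0:
        raise ValueError("mean_s must be positive")
    return math.exp(-threshold_s / mean_s)


# With a 15 second average, how often does a single block take over 45 seconds?
p = prob_interval_exceeds(45, 15)
print(round(p, 4))  # 0.0498, i.e. roughly one block in twenty
```

Under this model roughly one block in twenty takes more than three times the advertised average, before any real disruption happens.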

The blocktime issue can’t be resolved by forcing a level of asynchrony on Identity Consumers (your customers): they will find another provider.

And it can’t be resolved by onboarding users to the Identity Provider ahead of time: the users you successfully convert (and pay for) may not overlap well with the users your customers are targeting, and if your uptake on the customer side (and therefore revenue) does not scale according to your predictions you could end up in a lot of financial woe. The identity market is no different to the authentication market: customers drive user adoption, not the provider.

You could use unconfirmed transactions or a second channel to transmit realtime data — but given the large proportion of users creating a profile at the point of sale, you are seriously weakening the blockchain promise. So, what’s the point?

Once Casper becomes a reality and has been in production for long enough, the blocktime issue will be less significant. But today, it’s not finished, not released, and has definitely not withstood the test of time.


6. THE REAL COST OF BLOCKCHAIN

Maybe you have the right brains on your team and the right level of investment to successfully climb the mountain of additional complexity.

And maybe you’ve found a customer segment that can not only live with the current 15 second average blocktime of a chain like Ethereum, but can specifically benefit from its use.

Even then, most of the attack vectors listed in this post exist whether you use blockchain or not, and the biggest cost of blockchain is the extent to which it will distract your team from successfully executing where it matters most: securing your users from identity theft.


VERDICT

  • Blockchain does not resolve any significant security concern in the context of identity, especially privacy
  • Blockchain name services have serious and unresolved issues with phishing and revocation
  • Blockchain makes everything more complex, which makes a startup more expensive, more likely to be delayed, more likely to fail, more likely to have issues in production and more likely to deploy vulnerabilities
  • If a given identity startup chooses to use blockchain, they are more likely to fail at protecting their users’ privacy

ADDENDUM

Here’s a speculative view on the real cause for the blockchain identity trend:

  1. Blockchain entrepreneurs are routinely exposed to the problem space of identity
  2. A handful of these entrepreneurs, who are very comfortable with blockchain development and now view everything as a blockchain problem, see an opportunity and start identity businesses making claims about blockchain being the basis of their system’s security
  3. These claims are validated not by critical analysis and peer review, but by VC funding, media releases and blog posts
  4. Other entrepreneurs see this tidal wave of hype about blockchain identity startups, and assume not only that their claims are valid but also that the best way to gain exposure, funding and traction is to use blockchain in their own identity startups — and then go on to make the same claims
  5. Goto step 3, repeat ad nauseam
