It is possible to create a faster ECDSA implementation (GPU or CPU) by compromising on the security of the private key, which is acceptable now that the new algorithm does not use the account's private key. But would such an implementation be useful for anything other than mining?
I experimented a bit with the EC library before the hard fork (just out of curiosity; I do not have the hardware to profit from mining).
First, I ported the changes from this secp256k1 fork. Simple modifications, such as replacing constant-time functions with variable-time equivalents, gave about a 15% increase in hash rate. After adapting their ecmult implementation (ecmult_big), I got 10% more.
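To illustrate why variable-time code is faster, here is a minimal Python sketch, not the code from the fork. It compares a double-and-add scalar multiplication that skips work on zero bits with one that performs the same sequence of operations for every scalar. The function names are made up for this example, the arithmetic is plain affine math on the secp256k1 curve, and Python itself gives no timing guarantees, so only the operation counts are meaningful. The real library uses different techniques for both variants.

```python
# secp256k1 field prime and generator point
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Affine point addition; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a + (-a)
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul_vartime(k, pt):
    """Double-and-add: the number of additions depends on the bits of k."""
    result, adds = None, 0
    for bit in bin(k)[2:]:
        result = add(result, result)
        if bit == "1":
            result = add(result, pt)
            adds += 1
    return result, adds

def mul_fixed(k, pt, bits=256):
    """Always double and always add, then select: fixed operation count."""
    result, adds = None, 0
    for i in reversed(range(bits)):
        result = add(result, result)
        candidate = add(result, pt)
        adds += 1
        # A real constant-time implementation would use a branch-free select.
        result = candidate if (k >> i) & 1 else result
    return result, adds

if __name__ == "__main__":
    for k in (5, 2**200 + 1):
        fast, fast_adds = mul_vartime(k, G)
        safe, safe_adds = mul_fixed(k, G)
        print(k.bit_length(), fast == safe, fast_adds, safe_adds)
```

Both functions return the same point, but the variable-time one does far fewer additions for scalars with few set bits. That saving is the source of the speedup, and the dependence of the work on the scalar is exactly what leaks information about a secret key.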
I also replaced the RFC 6979 nonce function with a trivial implementation, which gained me 30% more hash rate on top of the above, but this hack is no longer relevant.
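For context on why the nonce function costs so much, below is a rough Python sketch of the RFC 6979 nonce derivation for a 256-bit curve with SHA-256, next to a hypothetical trivial replacement. The post does not say what the trivial implementation actually was, so `trivial_nonce` is purely an assumption for illustration, and `rfc6979_nonce` follows the RFC text rather than the library's exact code.

```python
import hashlib
import hmac

# Order of the secp256k1 group
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def _hmac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def rfc6979_nonce(priv: int, msg_hash: bytes) -> int:
    """Deterministic nonce per RFC 6979 section 3.2 (hash and order both 256 bits)."""
    x = priv.to_bytes(32, "big")
    h1 = (int.from_bytes(msg_hash, "big") % N).to_bytes(32, "big")
    v = b"\x01" * 32
    k = b"\x00" * 32
    k = _hmac(k, v + b"\x00" + x + h1)
    v = _hmac(k, v)
    k = _hmac(k, v + b"\x01" + x + h1)
    v = _hmac(k, v)
    while True:
        v = _hmac(k, v)
        candidate = int.from_bytes(v, "big")
        if 1 <= candidate < N:
            return candidate
        # Out of range (extremely rare): update the state and retry.
        k = _hmac(k, v + b"\x00")
        v = _hmac(k, v)

def trivial_nonce(msg_hash: bytes) -> int:
    """Hypothetical shortcut: reuse the message hash as the nonce.

    No HMAC calls at all, but the nonce is predictable, so anyone who
    sees a signature can recover the signing key. Only tolerable when
    the key protects nothing.
    """
    return int.from_bytes(msg_hash, "big") % (N - 1) + 1

if __name__ == "__main__":
    digest = hashlib.sha256(b"example block header").digest()
    print(hex(rfc6979_nonce(1, digest)))
    print(hex(trivial_nonce(digest)))
```

The RFC 6979 path needs at least five HMAC-SHA256 invocations per signature, while the shortcut needs none, which is consistent with a large gain when signing is the inner loop of the hash function.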
Thanks for sharing.
I retried my experiments with the current version and got about a 60% increase in hash rate by applying the changes from the fast unsafe secp256k1 fork and enabling GMP. Details and patches are here.