*Note: This review and score are based purely on the information disclosed by the validator service and on the scoring rubric.
Last Updated: Oct 6, 2019
Certus.One was founded by a team of experts with a focus on security. The team was one of the winners of Game of Stakes and is currently one of the top 3 validators in the Cosmos network.
Team Background (100/100)
- Full-Time/Part-Time (10/10)
- Prior Blockchain Dev/Impact (10/10)
- Systems Experience (10/10)
- Recognizability (10/10)
Current Voting Power (63/100)
- Total Staked: (9/10)
- Unique Self-Bonders: (10/10)
- Commissions: (0/10)
Historical Metrics (60/100)
- Uptime (8/10)
- Proposals (4/10)
- Legal Compliance/Insurance (+5)
- Innovations (+5)
Certus.One was co-founded by two individuals, Hendrik Hofstadt and Leopold Schabel. Combined, they have deep expertise in blockchain technology and in building highly available, performant infrastructure. Leopold has a background in system administration and web hosting. He has worked on the largest DDoS protection software used by banks in Germany. Hendrik specializes in software engineering and security, with about 3 years of blockchain experience, including Ethereum and interchain communication. He’s very interested in Proof-of-Stake systems, which drew him to the Cosmos ecosystem. Hendrik has a complete understanding of the Cosmos SDK and to date has reported 7 bugs to the Cosmos system, 2 of which were critical bugs in the token minting process. Hendrik and Leopold met at one of Europe’s largest cybersecurity challenges.
The team takes a depth-over-breadth approach with Certus.One. It was stated in our discussion that one of Certus.One’s differentiators is their technical expertise and excellence. They like to be involved in the governance proposals of the networks they validate. While the team currently has a deep understanding of Cosmos validation, they plan to expand further into other blockchains.
Certus.One is a major validator for both the Cosmos Hub and IRISnet blockchains. The service has ~9M atoms staked on the Cosmos Hub, with a large spread of delegators. Their total stake gives the service ~6% of the voting power in the network. On the IRISnet blockchain, the service has ~12M iris staked, translating to ~2% of the voting power. However, on the IRISnet blockchain, Certus.One only has 4 delegators, with 95.85% of their stake coming from a single address.
Certus.One has maintained a highly available validator service. The service has been registered since the genesis block on the Cosmos network and has recorded 100% uptime. However, the Hubble block explorer does show the service missing some pre-commits at the end of April 2019 on the Cosmos Hub and at the end of May 2019 on IRISnet.
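To put the figures above in perspective, the short calculation below derives the total bonded stake implied by each network's numbers and the size of the single dominant IRISnet delegation. It is only a back-of-the-envelope sketch: the inputs are the rounded values quoted in this review, and the function name is our own.

```python
def implied_total_stake(validator_stake, voting_power):
    """Total bonded stake implied by a validator's stake and its voting power share."""
    return validator_stake / voting_power


# Rounded figures quoted in this review (approximate by nature).
cosmos_total = implied_total_stake(9_000_000, 0.06)   # atoms bonded on the Cosmos Hub
iris_total = implied_total_stake(12_000_000, 0.02)    # iris bonded on IRISnet

# Share of Certus.One's IRISnet stake that comes from a single address.
largest_iris_delegation = 12_000_000 * 0.9585

print(f"Implied Cosmos Hub bonded stake: {cosmos_total:,.0f} atoms")
print(f"Implied IRISnet bonded stake:    {iris_total:,.0f} iris")
print(f"Largest IRISnet delegation:      {largest_iris_delegation:,.0f} iris")
```

With these rounded inputs, the implied totals are roughly 150M atoms and 600M iris, and the single largest IRISnet delegation is roughly 11.5M iris. If that one address were to undelegate, almost all of the service's IRISnet voting power would go with it.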
Certus.One was the winner of Game of Stakes before the Cosmos launch. Zaki Manian, Director of Tendermint Labs, published a blog post, “Game of Stakes Closing Ceremonies”. In the post, Zaki commends Certus.One for their incredible accomplishments with “the most precommits, flawless uptime, and leading the GoS6 hard fork”. The team also provided a fix for a state machine bug in the first GoS network.
Certus.One has also published their own Validator Operations Guide to assist new and smaller validators with their setup. The guide is very comprehensive, listing good practices for System Design/Engineering, the Tendermint P2P layer, HSMs, Validator High-Availability, and more. In our discussion, Hendrik stated that small validators should be able to run a setup as secure as that of a top-5 validator.
While the Certus.One website currently does not reflect this, it was stated in our discussion that their service has a focus on compliance. The team provides custom accounting solutions for delegators and also abides by a Rewards SLA (service level agreement). The team will compensate delegators for missed rewards if the agreement is broken. Certus.One’s website will be updated in the future to provide more information.
Outside of the SLA, Certus.One does not insure their delegators in the event of slashing. The team is actively looking into adding slashing terms to the agreement.
- Failover (30/30)
- Private Peering (10/10)
- Agreements with other Validators (10/10)
- Sentry Scaling (10/10)
- Backup Strategy
Certus.One is one of the few validators to implement an active/active setup for their validator nodes. The two nodes are secured in two different data centers. One location is disclosed to be in Frankfurt, Germany, while the second location is secret. The validators sync on signing blocks using coordinators to establish consensus and a distributed lock across cloud providers. DDoS protection offered by the data center is set up in front of the validator nodes. The protection software the data center provides is the same software that Certus.One’s co-founder, Leopold, worked on before Certus.One.
In our discussion, Hendrik stated that in the next couple of months, Certus.One will be releasing their “third-gen” architecture for their validator setup. The focus of this new architecture is a highly available and distributed key management system, with the keys secured in an enclave and added double-sign protection. With this upcoming setup, Hendrik stated that there will no longer be a need to maintain standalone validator nodes or to distinguish them from sentries. All nodes can be deployed using cloud providers.
Certus.One currently uses the standard sentry-based architecture surrounding their validator nodes. The nodes are deployed amongst different cloud providers, with some privately peered with selected validators. In our discussion, it was mentioned that scaling up sentries to mitigate DDoS attacks is a valid method but can be costly and is not efficient. The team uses alternative methods instead of autoscaling. In addition to the normal sentries that accept both inbound and outbound traffic, there are dedicated sentries that are outbound only, with inbound connections dropped at the firewall level. The team has included a writeup of this design and the reasoning behind it in their validator knowledge base.
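The idea behind an active/active pair guarded by a distributed lock can be illustrated with a small sketch. This is not Certus.One's implementation, which was not disclosed in detail. It assumes a lock keyed by block height, replaces the real cross-provider coordinators with an in-memory object, and uses made-up node names.

```python
import threading


class HeightLock:
    """In-memory stand-in for a distributed lock keyed by block height.

    Assumption: a real deployment would back this with a replicated
    coordination service rather than a local dictionary.
    """

    def __init__(self):
        self._owners = {}               # height -> node id that won the lock
        self._mutex = threading.Lock()

    def try_acquire(self, height, node_id):
        """Return True only for the first node to claim a given height."""
        with self._mutex:
            owner = self._owners.setdefault(height, node_id)
            return owner == node_id


def sign_if_lock_held(lock, node_id, height):
    """Sign the block at `height` only if this node holds the lock for it."""
    if lock.try_acquire(height, node_id):
        return f"{node_id} signed height {height}"
    return None  # the other active node already owns this height


# Both active nodes (hypothetical names) race to sign the same height.
lock = HeightLock()
results = [sign_if_lock_held(lock, node, 100)
           for node in ("dc-frankfurt", "dc-secondary")]
print(results)
```

Only the first node to claim a height produces a signature, and the second gets nothing back. That is the property that lets both nodes stay active without risking a double sign.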
Regular snapshots are taken of their blockchain data to ease the deployment of new nodes and to avoid re-syncing from genesis.
Certus.One’s technical expertise has led them to create a lot of custom tooling around observability. In addition to their validator service, the team also provides an API service, Stargazer. Stargazer offers auto-scaling API/RPC hosts for Tendermint-based chains. Their blockchain data is regularly indexed into an SQL database to enable complex queries.
Monitoring Tools (83/100)
- Network Level (10/10)
- Hardware Level (5/10)
- Paging (10/10)
Single Point of Failure (100/100)
- Multi-Cloud (10/10)
- Multi-Region (10/10)
Key Management (75/100)
- HSM Selections (10/10)
- Smart Key Management (5/10)
Validator Access (100/100)
- Physical/Remote (10/10)
Certus.One uses standard monitoring tools such as Prometheus in addition to an on-call rotation to maintain their systems. The team also implemented their own custom solution for notifications in case PagerDuty fails. In addition to the metrics used to monitor the health of their servers, the team also watches network metrics such as chain halts and changing block times.
In our discussion, Hendrik stressed the importance of paging staff only when needed. There are notable incidents of failures within large corporations that occurred because individuals turned off alerting systems for being “too annoying”. Often, errors can be ignored and do not require immediate attention (especially in the middle of the night). A high “signal-to-noise” ratio must be maintained in the pages that are sent out. Certus.One implements symptoms-based reporting and alerting. The team has posted a detailed writeup of their monitoring practice in their knowledge base.
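As an illustration of symptoms-based alerting, the sketch below pages a human only for symptoms a delegator would notice (a halted chain, missed pre-commits) and routes a likely cause (high CPU) to a non-urgent queue. The field names, thresholds, and routing labels are our own assumptions and are not taken from Certus.One's monitoring setup.

```python
from dataclasses import dataclass


@dataclass
class ChainSnapshot:
    seconds_since_last_block: float   # network-level symptom
    missed_precommits_last_100: int   # validator-level symptom
    cpu_percent: float                # host-level cause, not a symptom


# Hypothetical thresholds chosen for illustration only.
HALT_SECONDS = 60
MISSED_PRECOMMIT_LIMIT = 10
CPU_WARN_PERCENT = 90


def classify(snapshot):
    """Return 'page' for user-visible symptoms, 'ticket' for causes, else 'ok'."""
    if snapshot.seconds_since_last_block > HALT_SECONDS:
        return "page"      # chain appears halted: wake someone up
    if snapshot.missed_precommits_last_100 > MISSED_PRECOMMIT_LIMIT:
        return "page"      # validator is visibly missing votes
    if snapshot.cpu_percent > CPU_WARN_PERCENT:
        return "ticket"    # worth a look, but not at 3 a.m.
    return "ok"


print(classify(ChainSnapshot(5.0, 0, 95.0)))     # busy host, healthy chain
print(classify(ChainSnapshot(120.0, 0, 10.0)))   # no new block for two minutes
```

Keeping the paging conditions limited to symptoms is one way to keep the signal-to-noise ratio high: a busy host that is still signing every block produces a ticket instead of a page.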
Single Points of Failure
Certus.One uses a variety of cloud providers for their sentry nodes and regionally separate data centers for the two active validator nodes. There is no clear single point of failure in their architecture outside of host access to the validator machines.
Certus.One uses the YubiHSM. Hardware-based double-sign protection is not possible on the YubiHSM. The team currently utilizes the Tendermint KMS to provide software-based double-sign protection. The Ledger is an alternative option that has hardware-based double-sign protection. However, Hendrik noted that the Ledger is not an enterprise-grade solution and should be avoided.
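Software-based double-sign protection generally works by remembering the last consensus position that was signed and refusing any request that does not move forward. The sketch below shows that general idea only; it is not the Tendermint KMS code. The class and method names are hypothetical, the state is kept in memory rather than persisted to disk, and it rejects every repeat of the same position, which is stricter than a real signer needs to be.

```python
class DoubleSignError(Exception):
    """Raised when a signing request would go backwards or repeat a position."""


class SignState:
    """Tracks the last signed (height, round, step) as a high-water mark."""

    def __init__(self):
        self.last = (0, 0, 0)

    def check_and_update(self, height, round_, step):
        """Allow signing only if the request is strictly ahead of the last one."""
        request = (height, round_, step)
        # Tuples compare lexicographically: height first, then round, then step.
        if request <= self.last:
            raise DoubleSignError(f"refusing {request}, already at {self.last}")
        self.last = request


state = SignState()
state.check_and_update(10, 0, 1)      # first vote at height 10
state.check_and_update(10, 0, 2)      # later step, same height and round
try:
    state.check_and_update(10, 0, 1)  # stale request: must be refused
    refused = False
except DoubleSignError:
    refused = True
print(refused)
```

Because this check lives in software next to the HSM rather than inside it, the protection is only as strong as the integrity of the host that stores the state, which is the trade-off the paragraph above describes.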
The validator machines in both data centers can be accessed physically within 1 hour if required. The machines can also be accessed remotely using SSH. This access is further restricted with key-based 2FA.
As outlined in the validator architecture, the team utilizes outbound-only sentry nodes and a global TCP proxy to provide a level of defense against DoS attacks.
Certus.One’s third-generation key management solution is the big focus for the team. It was stated in our discussion that they believe validation services alone will not scale forever. Selling this new KMS solution at an affordable rate is a part of the team’s plan moving forward. Hendrik made a notable comment in our discussion: validation should be affordable, with a low barrier of entry for new and smaller validators. With their upcoming KMS solution, anyone will be able to set up a validation service with the same security as a top-5 validator.