Monitor system #12

Open
9 of 11 tasks
str opened this issue Jun 9, 2020 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@str
Member

str commented Jun 9, 2020

We've seen eosdac-client show the wrong data multiple times. We think the problem is in the API: sometimes it has been down or has stopped updating its data, and other times we suspect too much traffic on mainnet.

One proposal was a monitoring system that tracks whether the API is up or down, but we agreed that this alone is not enough, because the API can be up while serving outdated data. We need to:

  • create a server/vps to monitor the API
  • trigger notifications via email
  • trigger notifications via discord
  • add new endpoints to track more information, like last block read (@michaeljyeates)
  • create multiple metrics about the API
  • track state in jungle
  • increase the interval between tests
  • track state in mainnet
  • add a default filter to limit the number of results per query
  • solve the issue with the API going down
  • see if msigs not showing updated msig count #11 is related
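
As a rough illustration of the first task, a monitor could poll the API and classify each response. This is a minimal sketch, not the monitor actually deployed: the function names, the "up"/"empty"/"down" labels, and the 10-second timeout are assumptions. The URL is the endpoint used in the curl test further down this thread.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Endpoint taken from the curl test in this thread.
API_URL = "https://api.eosdac.io/v1/eosdac/msig_proposals"


def http_get(url: str, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Return (status, body). Status 0 means no HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code, err.read()
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0, b""


def check_api(fetch: Callable[[str], Tuple[int, bytes]] = http_get,
              url: str = API_URL) -> str:
    """Classify one probe as "up", "empty", or "down" (labels are assumptions)."""
    status, body = fetch(url)
    if status != 200:
        return "down"
    try:
        data = json.loads(body)
    except ValueError:
        # A 200 response that is not valid JSON is treated as down.
        return "down"
    return "up" if data else "empty"
```

Passing `fetch` as a parameter keeps the classification testable without network access. Note that this only answers "up or down"; it says nothing about whether the data is current.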
@str str self-assigned this Jun 9, 2020
@str str added this to the dacfactory-v1.0 milestone Jun 9, 2020
@str
Member Author

str commented Jun 9, 2020

To document what we saw today:

  • msigs showed outdated information about the signatures per msig
  • later, some msigs were updated with the right signature count, but others that were signed at the same time weren't
  • later still, the executed msigs were still shown as open

image

@str
Member Author

str commented Jun 11, 2020

I've confirmed that a new custodian just signed all proposals, but the dacclient still shows the wrong numbers. The API data is still wrong.

image

@str
Member Author

str commented Jun 11, 2020

I'm now tracking whether the API returns any data at all, which tells us if the API is up or down.

image

But that doesn't give us much detail about how old the data in the API is.
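
One way to measure how old the data is would be to compare the last block the API has read (the new endpoint proposed in the task list) with the current head block of the chain. The sketch below assumes both numbers are already available; how the API will expose its last block is not defined in this thread, so `api_last_block` is a hypothetical input, and the 60-second threshold is an arbitrary example. The head block could plausibly come from the `head_block_num` field of a node's `/v1/chain/get_info` response.

```python
# EOSIO chains produce a block every half second.
BLOCK_INTERVAL_SECONDS = 0.5


def lag_seconds(api_last_block: int, chain_head_block: int) -> float:
    """Approximate how far the API is behind the chain, in seconds."""
    # Clamp at zero: the API may briefly report a block the node has not seen yet.
    return max(0, chain_head_block - api_last_block) * BLOCK_INTERVAL_SECONDS


def is_stale(api_last_block: int, chain_head_block: int,
             max_lag_seconds: float = 60.0) -> bool:
    """True when the API lags the chain by more than max_lag_seconds (assumed threshold)."""
    return lag_seconds(api_last_block, chain_head_block) > max_lag_seconds
```

With a check like this, the monitor could report "up but outdated" as a state separate from "down".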

@str
Member Author

str commented Jun 14, 2020

OK, mail notifications are enabled.

image
image
image

@str
Member Author

str commented Jun 14, 2020

OK, notifications that the API is down are now sent to the tech channel.

image
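
For reference, a notification like this can be sent through a Discord webhook, which accepts a JSON body with a `content` field. The sketch below is not the setup used here: the function names, the message wording, the `User-Agent` value, and the rule of alerting only on state changes are all assumptions, and the webhook URL has to be supplied by whoever runs it.

```python
import json
import urllib.request
from typing import Optional


def should_notify(previous: Optional[str], current: str) -> bool:
    """Alert only when the state changes, so a long outage does not spam the channel."""
    if previous is None:
        # First check after startup: alert only if something is already wrong.
        return current != "up"
    return previous != current


def build_payload(api_name: str, current: str) -> dict:
    """Build the JSON body for a Discord webhook message."""
    return {"content": f"{api_name} API is now {current.upper()}"}


def post_to_discord(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "api-monitor-sketch",  # hypothetical client name
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Alerting on transitions rather than on every failed check also produces a message when the API recovers.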

@str str closed this as completed Jun 17, 2020
@str str added the bug Something isn't working label Jun 19, 2020
@str
Member Author

str commented Jun 19, 2020

Mainnet API has not been answering.

image

We now have more information confirming the API is down: the request below took 1m48s and returned a 404.

time curl -v -I https://api.eosdac.io/v1/eosdac/msig_proposals
*   Trying 79.137.68.20:443...
* TCP_NODELAY set
* Connected to api.eosdac.io (79.137.68.20) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=eu.eosdac.io
*  start date: May 11 17:43:45 2020 GMT
*  expire date: Aug  9 17:43:45 2020 GMT
*  subjectAltName: host "api.eosdac.io" matched cert's "api.eosdac.io"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55cb00af4db0)
> HEAD /v1/eosdac/msig_proposals HTTP/2
> Host: api.eosdac.io
> user-agent: curl/7.68.0
> accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 404 
HTTP/2 404 
< access-control-allow-origin: *
access-control-allow-origin: *
< content-type: application/json; charset=utf-8
content-type: application/json; charset=utf-8
< content-length: 60
content-length: 60
< date: Fri, 19 Jun 2020 22:00:32 GMT
date: Fri, 19 Jun 2020 22:00:32 GMT
< 
* Connection #0 to host api.eosdac.io left intact
real	1m48.382s
user	0m0.005s
sys	0m0.009s
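
The curl run above shows two separate symptoms: an error status (404) and a very long response time (1m48s). A monitor could record both for each probe. This is a sketch under assumptions: the function names and the 5-second "slow" threshold are illustrative and do not come from the monitor used in this issue.

```python
import time
from typing import Callable, List, Optional, Tuple


def timed(fetch: Callable[[], Optional[int]]) -> Tuple[Optional[int], float]:
    """Run one probe and return (status, elapsed seconds). Status None means no response."""
    start = time.monotonic()
    status = fetch()
    return status, time.monotonic() - start


def problems(status: Optional[int], elapsed: float,
             slow_after: float = 5.0) -> List[str]:
    """List everything wrong with a probe result; an empty list means healthy."""
    found = []
    if status is None:
        found.append("no response")
    elif not 200 <= status < 300:
        found.append(f"http {status}")
    if elapsed > slow_after:
        found.append(f"slow ({elapsed:.1f}s)")
    return found
```

For the request above, `problems(404, 108.382)` returns `["http 404", "slow (108.4s)"]`, so the alert can say what failed rather than only that something did.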

@str str reopened this Jun 19, 2020
@str
Member Author

str commented Jun 22, 2020

We've been monitoring for 1h and it looks better now:

image
