Monitoring

We want to know how the website is used and how it is performing. We care about metrics (e.g. server CPU and memory usage, file and network I/O, response time), analytics (page views, user agents), error tracking (e.g. exceptions during requests), and logs (server info and warning/error logs). In addition to reviewing this information manually, we also want to be alerted in critical cases (e.g. when an exception occurs or when CPU load is too high).

Some of the use cases are best handled by different tools.

Analytics

We are currently self-hosting a plausible.io instance at analytics.vp-services.de. Website visits are tracked automatically (in a privacy-conscious manner) via a script that is embedded on each page.

Ask someone to invite you to the project on Plausible if you do not already have an account.

Everything else

AppSignal

We are currently evaluating AppSignal as a tool for metrics, error tracking, log viewing and alerting. It is built and hosted in the EU and appears to cover all of our monitoring needs in a single product.

AppSignal collects data using the OpenTelemetry standard. Remix is instrumented (think of it as "wrapped") by opentelemetry-instrumentation-remix to automatically trace requests and actions and to collect server errors.

Frontend errors are tracked by sending them through the AppSignal frontend client.

See the AppSignal dashboard here. Ask someone to invite you to the VP-Systeme organisation if you are not already a member.

Testing error tracing

Trigger test errors via the /test-server-error and /test-client-error routes.
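As a rough illustration of what a server-side test route can look like, the sketch below shows a loader that always throws. The file path and the error message are assumptions made for this example; the actual routes in the repository may be implemented differently.

```typescript
// Sketch of a route module such as app/routes/test-server-error.tsx
// (hypothetical path). In a real Remix route module this function has to be
// exported, i.e. `export function loader() { ... }`.
function loader(): never {
  // The timestamp makes individual test errors easy to tell apart later.
  throw new Error(`Test server error triggered at ${new Date().toISOString()}`);
}
```

Because the error is thrown inside the loader, it passes through the instrumented request handling and should show up as a server error for that request.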

Seeing traces in development

To see traces in development, you have to configure the AppSignal API keys in your shell (a .env file will not work because it is not loaded in production mode) and run the production build with npm run build && npm run start.

Possible tracing improvements

We should add Fly instance and region information to the traces if we use more than one machine at some point in the future.
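A minimal sketch of how that information could be gathered, assuming the app reads the FLY_REGION and FLY_MACHINE_ID environment variables that Fly sets at runtime. The function name and the attribute keys (fly.region, fly.machine_id) are made up for this example and are not an established convention.

```typescript
type Env = Record<string, string | undefined>;

// Builds trace attributes from the Fly runtime environment.
// The attribute keys are hypothetical; pick whatever naming fits the project.
function flyTraceAttributes(env: Env): Record<string, string> {
  const attributes: Record<string, string> = {};
  const region = env["FLY_REGION"];
  const machineId = env["FLY_MACHINE_ID"];
  // Outside of Fly (e.g. local development) the variables are unset,
  // so the corresponding attributes are simply left out.
  if (region) attributes["fly.region"] = region;
  if (machineId) attributes["fly.machine_id"] = machineId;
  return attributes;
}

// Possible usage in the server entry point: flyTraceAttributes(process.env)
```

The resulting object could then be attached to the active span, for example via the OpenTelemetry span's setAttributes method.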

We could add more data to our traces. For example, we could add the GROQ query to our Sanity HTTP traces.
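One way to obtain the GROQ query is to read it from the request URL, since Sanity query requests sent via GET carry it in the query URL parameter. The sketch below assumes that URL shape; the function name is hypothetical, and queries sent in a request body would not be covered.

```typescript
// Extracts the GROQ query from a Sanity query request URL, if present.
// Returns undefined for invalid URLs and for non-query endpoints.
function groqQueryFromSanityUrl(requestUrl: string): string | undefined {
  let url: URL;
  try {
    url = new URL(requestUrl);
  } catch {
    return undefined;
  }
  // Only the data query endpoint carries a GROQ query.
  if (!url.pathname.includes("/data/query/")) return undefined;
  // searchParams.get already decodes the percent-encoded value.
  return url.searchParams.get("query") ?? undefined;
}

// Example with a made-up project id and dataset:
const exampleQuery = groqQueryFromSanityUrl(
  "https://abc123.api.sanity.io/v2021-10-21/data/query/production?query=*%5B_type%20%3D%3D%20%22post%22%5D"
);
// exampleQuery is '*[_type == "post"]'
```

The extracted string could be stored as an attribute on the HTTP span so that slow or failing Sanity requests can be traced back to a specific query.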

Built-in Fly metrics

Since we're hosted on fly.io, we get built-in low-level metrics. The metrics are collected by Prometheus and visualized by Grafana. These features are documented here. View the Grafana dashboard here.

Logging

We can send production logs to AppSignal via fly-log-shipper. We may not need logs at all if we rely on tracing instead (i.e. sending meaningful data to AppSignal with each trace).
