When an Automate HA cluster is failed over, the PostgreSQL cluster logging is minimal to non-existent.
Overall, logging for all modules should improve to the point of usability before Automate HA is marked GA.
For examples of good logging, see a standalone Chef Server's logging output, or Chef Backend's, for example the replication lag indications in its logs.
Automate HA PostgreSQL logs (pg_log) from the backend PostgreSQL nodes are not captured by chef-automate gatherlog bundles. These logs are essential for troubleshooting.
On a live system they are in the directory:
/hab/svc/automate-ha-postgresql/var/pg_log
In the gatherlog bundle, the /hab/svc/automate-ha-postgresql/var/ directory is not captured:
/hab/svc/automate-ha-postgresql > ls
total 0
drwxr-x---@ 15 user 1083951318 480 20 Mar 14:32 config
drwxr-xr-x@ 12 user 1083951318 384 20 Mar 14:32 logs
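Until the bundle includes that directory, the logs can be collected by hand. The following is a minimal sketch of one way to do so, not part of chef-automate: the function name `archive_pg_logs` and the output location are hypothetical, and the only path taken from this report is the pg_log directory itself.

```python
import tarfile
from pathlib import Path

# Directory named in this report; adjust if your installation differs.
PG_LOG_DIR = Path("/hab/svc/automate-ha-postgresql/var/pg_log")


def archive_pg_logs(log_dir: Path, dest: Path) -> Path:
    """Pack everything under log_dir into a gzip-compressed tarball at dest.

    Hypothetical helper for illustration; it is not a chef-automate command.
    """
    if not log_dir.is_dir():
        raise FileNotFoundError(f"log directory not found: {log_dir}")
    with tarfile.open(str(dest), "w:gz") as tar:
        # arcname keeps archive paths relative, e.g. "pg_log/<file>".
        tar.add(str(log_dir), arcname=log_dir.name)
    return dest


# Example call on a backend PostgreSQL node (output path is an assumption):
# archive_pg_logs(PG_LOG_DIR, Path("/tmp/automate-ha-pg_log.tar.gz"))
```

Run it on each backend PostgreSQL node and attach the resulting tarball alongside the gatherlog bundle.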
Proper PostgreSQL logs would be ideal as well. Can we ship PostgreSQL with logging already set up?
https://www.loggly.com/use-cases/postgresql-logs-logging-setup-and-troubleshooting/
Debug logging that applies to all services
Actual debug logging
Replication lag, so customers can choose the correct system to fail over to, or none at all
Per-request output for services. This could be enabled by a debug logging flag
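To illustrate why replication lag in the logs matters for the failover decision, here is a rough sketch of the choice an operator would make once the numbers are visible. Everything in it is an assumption for illustration: the function name `pick_failover_target`, the node names, the use of bytes as the lag unit, and the threshold are not taken from Automate HA.

```python
from typing import Mapping, Optional


def pick_failover_target(lag_bytes: Mapping[str, int],
                         max_lag_bytes: int) -> Optional[str]:
    """Return the follower with the smallest lag, or None if every
    follower is further behind than max_lag_bytes.

    Hypothetical helper: lag values are assumed to come from the logs.
    """
    eligible = {node: lag for node, lag in lag_bytes.items()
                if lag <= max_lag_bytes}
    if not eligible:
        return None  # no safe candidate: do not fail over
    return min(eligible, key=eligible.get)


# Hypothetical node names and lag values.
target = pick_failover_target(
    {"pg-follower-1": 4096, "pg-follower-2": 128},
    max_lag_bytes=1024,
)
# target == "pg-follower-2"
```

Without lag figures in the logs, neither branch of this decision (pick the least-lagged node, or decline to fail over) can be made with confidence.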