Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Support outputting structured logs in addition to standard logs #8607

Merged
merged 23 commits into from
Oct 29, 2020
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
2db1099
Remove the structured logging configuration code.
clokep Oct 21, 2020
509594f
Remove an unnecessary return value.
clokep Oct 20, 2020
99f50ec
Rework structured logging to use the Python standard library logging …
clokep Oct 21, 2020
fcb74ae
Pipe through the server_name properly.
clokep Oct 21, 2020
24a9882
Use a standard library logger.
clokep Oct 21, 2020
4462758
Update synmark for the changes.
clokep Oct 21, 2020
24ab2df
Update logging format
clokep Oct 26, 2020
5fbc11c
Fix-up formatting using __all__.
clokep Oct 26, 2020
b2fc88b
Do not build an unnecessary set.
clokep Oct 26, 2020
650bb09
Stop using the DEBUG decorators.
clokep Oct 26, 2020
7115697
Raise an error if structured is in the logging config.
clokep Oct 26, 2020
e855dbb
__all__ takes strings, not objects.
clokep Oct 26, 2020
7071d89
Update the sample config.
clokep Oct 26, 2020
6b785c1
Revamp tests a bit to avoid impacting other tests.
clokep Oct 26, 2020
8d51476
Abstract handling of loggers in tests.
clokep Oct 27, 2020
7fb5505
Add a test for including additional structured data.
clokep Oct 27, 2020
11a488c
Lint.
clokep Oct 27, 2020
a19c967
Fix test after rename.
clokep Oct 27, 2020
e98a6d1
Add an upgrade note.
clokep Oct 27, 2020
babdd5b
Rework the code to load logging configs.
clokep Oct 28, 2020
10738cc
Convert legacy drain configurations to standard library handler configs.
clokep Oct 28, 2020
f801d71
Add back a JSON formatter without time.
clokep Oct 28, 2020
1c0181a
Fix type hints.
clokep Oct 28, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions UPGRADE.rst
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,20 @@ for example:
wget https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.3.0+stretch1_amd64.deb
dpkg -i matrix-synapse-py3_1.3.0+stretch1_amd64.deb

Upgrading to v1.23.0
====================

Structured logging configuration breaking changes
-------------------------------------------------

This release introduces a backwars-incompatible change in the configuration of
structured logging. If your logging configuration contains ``structured: true``
then it must be modified based on the `structured logging document
<https://github.com/matrix-org/synapse/blob/master/docs/structured_logging.md>`_.

``drains`` are now an unused option and should be replaced by standard logging
configuration of ``handlers`` and ``formatters`.

Upgrading to v1.22.0
====================

Expand Down
1 change: 1 addition & 0 deletions changelog.d/8607.misc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Re-organize the structured logging code to separate the TCP transport handling from the JSON formatting.
4 changes: 4 additions & 0 deletions docs/sample_log_config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,11 @@
# This is a YAML file containing a standard Python logging configuration
# dictionary. See [1] for details on the valid settings.
#
# Synapse also supports structured logging for machine readable logs which can
# be ingested by ELK stacks. See [2] for details.
#
# [1]: https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema
# [2]: https://github.com/matrix-org/synapse/blob/master/docs/structured_logging.md

version: 1

Expand Down
113 changes: 53 additions & 60 deletions docs/structured_logging.md
Original file line number Diff line number Diff line change
@@ -1,83 +1,76 @@
# Structured Logging

A structured logging system can be useful when your logs are destined for a machine to parse and process. By maintaining its machine-readable characteristics, it enables more efficient searching and aggregations when consumed by software such as the "ELK stack".
A structured logging system can be useful when your logs are destined for a
machine to parse and process. By maintaining its machine-readable characteristics,
it enables more efficient searching and aggregations when consumed by software
such as the "ELK stack".

Synapse's structured logging system is configured via the file that Synapse's `log_config` config option points to. The file must be YAML and contain `structured: true`. It must contain a list of "drains" (places where logs go to).
Synapse's structured logging system is configured via the file that Synapse's
`log_config` config option points to. The file should include a formatter which
uses the `synapse.logging.TerseJsonFormatter` class included with Synapse and a
handler which uses the above formatter.

A structured logging configuration looks similar to the following:

```yaml
structured: true
version: 1

formatters:
structured:
class: synapse.logging.TerseJsonFormatter

handlers:
file:
class: logging.handlers.TimedRotatingFileHandler
formatter: structured
filename: /path/to/my/logs/homeserver.log
when: midnight
backupCount: 3 # Does not include the current log file.
encoding: utf8

loggers:
synapse:
level: INFO
handlers: [remote]
synapse.storage.SQL:
level: WARNING

drains:
console:
type: console
location: stdout
file:
type: file_json
location: homeserver.log
```

The above logging config will set Synapse as 'INFO' logging level by default, with the SQL layer at 'WARNING', and will have two logging drains (to the console and to a file, stored as JSON).

## Drain Types

Drain types can be specified by the `type` key.

### `console`

Outputs human-readable logs to the console.

Arguments:

- `location`: Either `stdout` or `stderr`.

### `console_json`

Outputs machine-readable JSON logs to the console.

Arguments:

- `location`: Either `stdout` or `stderr`.

### `console_json_terse`
The above logging config will set Synapse as 'INFO' logging level by default,
with the SQL layer at 'WARNING', and will log to a file, stored as JSON.

Outputs machine-readable JSON logs to the console, separated by newlines. This
format is not designed to be read and re-formatted into human-readable text, but
is optimal for a logging aggregation system.
It is also possible to figure Synapse to log to a remote endpoint by using the
`synapse.logging.RemoteHandler` class included with Synapse. It takes the
following arguments:

Arguments:

- `location`: Either `stdout` or `stderr`.

### `file`

Outputs human-readable logs to a file.

Arguments:

- `location`: An absolute path to the file to log to.

### `file_json`

Outputs machine-readable logs to a file.
- `host`: Hostname or IP address of the log aggregator.
- `port`: Numerical port to contact on the host.
- `maximum_buffer`: (Optional, defaults to 1000) The maximum buffer size to allow.

Arguments:
A remote structured logging configuration looks similar to the following:

- `location`: An absolute path to the file to log to.
```yaml
version: 1

### `network_json_terse`
formatters:
structured:
class: synapse.logging.TerseJsonFormatter

Delivers machine-readable JSON logs to a log aggregator over TCP. This is
compatible with LogStash's TCP input with the codec set to `json_lines`.
handlers:
remote:
class: synapse.logging.RemoteHandler
formatter: structured
host: 10.1.2.3
port: 9999

Arguments:
loggers:
synapse:
level: INFO
handlers: [remote]
synapse.storage.SQL:
level: WARNING
```

- `host`: Hostname or IP address of the log aggregator.
- `port`: Numerical port to contact on the host.
The above logging config will set Synapse as 'INFO' logging level by default,
with the SQL layer at 'WARNING', and will log JSON formatted messages to a
remote endpoint at 10.1.2.3:9999.
2 changes: 1 addition & 1 deletion scripts-dev/lint.sh
Original file line number Diff line number Diff line change
Expand Up @@ -80,7 +80,7 @@ else
# then lint everything!
if [[ -z ${files+x} ]]; then
# Lint all source code files and directories
files=("synapse" "tests" "scripts-dev" "scripts" "contrib" "synctl" "setup.py")
files=("synapse" "tests" "scripts-dev" "scripts" "contrib" "synctl" "setup.py" "synmark")
fi
fi

Expand Down
48 changes: 23 additions & 25 deletions synapse/config/logger.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@
import yaml

from twisted.logger import (
ILogObserver,
LogBeginner,
STDLibLogObserver,
eventAsText,
Expand All @@ -32,11 +31,8 @@

import synapse
from synapse.app import _base as appbase
from synapse.logging._structured import (
reload_structured_logging,
setup_structured_logging,
)
from synapse.logging.context import LoggingContextFilter
from synapse.logging.filter import MetadataFilter
from synapse.util.versionstring import get_version_string

from ._base import Config, ConfigError
Expand All @@ -48,7 +44,11 @@
# This is a YAML file containing a standard Python logging configuration
# dictionary. See [1] for details on the valid settings.
#
# Synapse also supports structured logging for machine readable logs which can
# be ingested by ELK stacks. See [2] for details.
#
# [1]: https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema
# [2]: https://github.com/matrix-org/synapse/blob/master/docs/structured_logging.md

version: 1

Expand Down Expand Up @@ -176,9 +176,9 @@ def generate_files(self, config, config_dir_path):
log_config_file.write(DEFAULT_LOG_CONFIG.substitute(log_file=log_file))


def _setup_stdlib_logging(config, log_config, logBeginner: LogBeginner):
def _setup_stdlib_logging(config, log_config, logBeginner: LogBeginner) -> None:
"""
Set up Python stdlib logging.
Set up Python standard library logging.
"""
if log_config is None:
log_format = (
Expand All @@ -196,6 +196,15 @@ def _setup_stdlib_logging(config, log_config, logBeginner: LogBeginner):
handler.setFormatter(formatter)
logger.addHandler(handler)
else:
# If the old structured logging configuration is being used, raise an
# error.
if "structured" in log_config and log_config.get("structured"):
raise ConfigError(
"The `structured` parameter is no longer supported, see the documentation "
"for enabling structured logging: "
"https://github.com/matrix-org/synapse/blob/master/docs/structured_logging.md"
)

logging.config.dictConfig(log_config)

# We add a log record factory that runs all messages through the
Expand All @@ -204,12 +213,14 @@ def _setup_stdlib_logging(config, log_config, logBeginner: LogBeginner):
# filter options, but care must when using e.g. MemoryHandler to buffer
# writes.

log_filter = LoggingContextFilter(request="")
log_context_filter = LoggingContextFilter(request="")
log_metadata_filter = MetadataFilter({"server_name": config.server_name})
old_factory = logging.getLogRecordFactory()

def factory(*args, **kwargs):
record = old_factory(*args, **kwargs)
log_filter.filter(record)
log_context_filter.filter(record)
log_metadata_filter.filter(record)
return record

logging.setLogRecordFactory(factory)
Expand Down Expand Up @@ -255,8 +266,6 @@ def _log(event):
if not config.no_redirect_stdio:
print("Redirected stdout/stderr to logs")

return observer


def _reload_stdlib_logging(*args, log_config=None):
logger = logging.getLogger("")
Expand All @@ -269,7 +278,7 @@ def _reload_stdlib_logging(*args, log_config=None):

def setup_logging(
hs, config, use_worker_options=False, logBeginner: LogBeginner = globalLogBeginner
) -> ILogObserver:
) -> None:
"""
Set up the logging subsystem.

Expand All @@ -282,9 +291,6 @@ def setup_logging(

logBeginner: The Twisted logBeginner to use.

Returns:
The "root" Twisted Logger observer, suitable for sending logs to from a
Logger instance.
"""
log_config = config.worker_log_config if use_worker_options else config.log_config

Expand All @@ -303,20 +309,12 @@ def read_config(*args, callback=None):

log_config_body = read_config()

if log_config_body and log_config_body.get("structured") is True:
logger = setup_structured_logging(
hs, config, log_config_body, logBeginner=logBeginner
)
appbase.register_sighup(read_config, callback=reload_structured_logging)
else:
logger = _setup_stdlib_logging(config, log_config_body, logBeginner=logBeginner)
appbase.register_sighup(read_config, callback=_reload_stdlib_logging)
_setup_stdlib_logging(config, log_config_body, logBeginner=logBeginner)
appbase.register_sighup(read_config, callback=_reload_stdlib_logging)

# make sure that the first thing we log is a thing we can grep backwards
# for
logging.warning("***** STARTING SERVER *****")
logging.warning("Server %s version %s", sys.argv[0], get_version_string(synapse))
logging.info("Server hostname: %s", config.server_name)
logging.info("Instance name: %s", hs.get_instance_name())

return logger
20 changes: 20 additions & 0 deletions synapse/logging/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# -*- coding: utf-8 -*-
# Copyright 2020 The Matrix.org Foundation C.I.C.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# These are imported to allow for nicer logging configuration files.
from synapse.logging._remote import RemoteHandler
from synapse.logging._terse_json import TerseJsonFormatter

__all__ = ["RemoteHandler", "TerseJsonFormatter"]
Loading