Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate audit logs from server logs. #27443

Merged
merged 1 commit into from
Mar 24, 2017
Merged

Conversation

sharidas
Copy link
Contributor

@sharidas sharidas commented Mar 21, 2017

This change will help users to separate logs
for each apps based on conditional logging.

Signed-off-by: Sujith H sharidasan@owncloud.com

Description

This change will help users to separate the audit logs from the server log. So lets say for each apps we can have conditional logs having separate log files.

Related Issue

Motivation and Context

This will help users to have separate logs for each apps. Instead of searching the entire server log, it would be better to separate the logs.

How Has This Been Tested?

Made the modifications in the files_texteditor and gallary app as follows:

https://paste.fedoraproject.org/paste/MF7RzFfirSFFfKYciC1UYl5M1UNdIGYhyRLivL9gydE=
Modify the config.php as follows:

  'log.condition' => [
    [
      'users' => ['user1'],
      'apps' => ['files_texteditor'],
      'logfile' => '/tmp/test.log'
    ],
    [
      'users' => ['user1'],
      'apps' => ['gallery'],
      'logfile' => '/tmp/gallery.log'
    ],
  ], 

After this try to open a text file in the files app. User should be able to see the logs written in /tmp/test.log
Similarly try to open the gallery with photos. User should be able to see the logs written in /tmp/gallery.log

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@mention-bot
Copy link

@sharidas, thanks for your PR! By analyzing the history of the files in this pull request, we identified @DeepDiver1975, @bartv2 and @oparoz to be potential reviewers.

@mmattel
Copy link
Contributor

mmattel commented Mar 21, 2017

Pls do also a documentation in config.sample.php
Q: is this for owncloud log only or does that apply to the syslog method too?

@DeepDiver1975 DeepDiver1975 added this to the 10.0 milestone Mar 21, 2017
@DeepDiver1975
Copy link
Member

Stacktrace

Test\LoggerTest::testAppCondition
Failed asserting that two arrays are equal.
--- Expected
+++ Actual
@@ @@
 Array (
-    0 => '1 Show info messages of files app'
-    1 => '2 Show warning messages of other apps'
+    0 => '2 Show warning messages of other apps'
 )

@sharidas
Copy link
Contributor Author

@mmattel Sure I will update the config.sample.php too. This patch tries to separate the logs for apps separate from owncloud log.

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from d5c10e3 to 9d3c5f5 Compare March 21, 2017 13:25
@jvillafanez
Copy link
Member

🚫 Personally, I'm not willing to merge this piece of code. I think there are better and cleaner alternatives to achive the same objective.

  • We can't guarantee that the apps will follow the approach we want. This means that a lot of apps won't work with these changes (the app will work but it won't log the way we want)
  • I expect a huge logCondition if we want all the apps to log in different files. It doesn't seem a maintenable approach even with vanilla installations.
  • There is a mix in the responsabilities: either the app decides what to log and under what conditions (leaving out core for this), or it's core the one filtering the logs however it wants.
  • Apps shouldn't care at all about where they're logging.

First, I'd add support to use several log handlers at the same time. Then we can add a new log handler (AppBasedLog, for example) that would write any log entry to the corresponding file based on the app. Additional optional configuration could be added to the new handler: directory location for the logs, user filters, etc.

I'd keep this on hold for now.

@PVince81
Copy link
Contributor

@jvillafanez while I agree that your proposed approach is significantly better, it will require rewriting / rewiring pieces of apps to make sure they are using the correct app-specific logger. And if apps can still use the same wrong logger we're back in a similar situation like we are now where apps need to specific for every message to which log it needs to go. (the difference in your approach being that there is only a single location where the app should pay attention which logger they get / use per DI while the current situation is per log entry).

I expect a huge logCondition if we want all the apps to log in different files. It doesn't seem a maintenable approach even with vanilla installations.

This is true. Might be possible to generate the config using a script but it's inelegant.

@jvillafanez do you have an estimate how much time it would take to go the proper route you proposed (and a quick plan) to compare with this PR's quick, inelegant but working approach.

@PVince81
Copy link
Contributor

Would also be good to hear from @butonic about #27443 (comment) as he's the one who proposed more of the logging flexibility stuff.

cc @pmaier1

@butonic
Copy link
Member

butonic commented Mar 21, 2017

@jvillafanez I don't fully understand:

We can't guarantee that the apps will follow the approach we want. This means that a lot of apps won't work with these changes (the app will work but it won't log the way we want)

Apps should follow our current logging interface and use the context parameter to set a value for 'app'. The current logging uses that and even logs 'no app in context' if it is forgotten. Without configuring any log.condition nothing changes from what is currently happening. everything lands in data/owncloud.log, which is what you want anyway if you are using a log aggregator like splunk or elk.

I expect a huge logCondition if we want all the apps to log in different files. It doesn't seem a maintenable approach even with vanilla installations.

Why would we want that? I think the most prominent use case is to log certain apps to a dedicated logfile, eg user_ldap or wnd because they add a lot of noise. Personally, I think splitting is bad because you might loose information if not all logs are sent to you. In any case the log.condition can be put in a separate logging.config.php

There is a mix in the responsabilities: either the app decides what to log and under what conditions (leaving out core for this), or it's core the one filtering the logs however it wants.

Coming from Java I personally like what most logging frameworks do there. Set a global log level plus set dedicated log levels per package and class. That is pretty flexible and allows to turn of logs that you are not interested it. It would be cool if you could split the log into two files, one with the stuff you think is relevant (the filtered log) and one with the full debug log ... but then again tail and grep are your friends.

Apps shouldn't care at all about where they're logging.
totally agree ... but I don't see why this PR is a problem for this?

IMO the bigger problem is that we sometimes concatenate a lot of strings that are then not logged. I'd like to use a closure to only evaluate the code when needed. but the PSR does not seem to allow that. We could probably add a closure to the context for complicated stuff and try to encourage developers to use only plain strings in the log message.

@jvillafanez
Copy link
Member

We can't guarantee that the apps will follow the approach we want. This means that a lot of apps won't work with these changes (the app will work but it won't log the way we want)

Apps should follow our current logging interface and use the context parameter to set a value for 'app'. The current logging uses that and even logs 'no app in context' if it is forgotten. Without configuring any log.condition nothing changes from what is currently happening. everything lands in data/owncloud.log, which is what you want anyway if you are using a log aggregator like splunk or elk.

I got confused. I read "modifications in apps" and said "this is bad". If apps don't need any change to be able to use these changes, that's fine.

I expect a huge logCondition if we want all the apps to log in different files. It doesn't seem a maintenable approach even with vanilla installations.

Why would we want that? I think the most prominent use case is to log certain apps to a dedicated logfile, eg user_ldap or wnd because they add a lot of noise. Personally, I think splitting is bad because you might loose information if not all logs are sent to you. In any case the log.condition can be put in a separate logging.config.php

I'm sure there will be people who will split the files just because it's possible. I think it's fine as long as we also have the full log: for small things or first contact, it might be useful to check small files based on each component or just for specific components.

My main point is that we should allow admins to configure several log handlers. Right now they have the possiblity to use syslog or the ownCloud log, only one of them. By letting the admin choose several of them we can easily extend the logging capabilities.
The secondary point is to create a new handler to apply these changes. This handler will take care of filtering and creating the required log files if needed.

The advantages:

  • more flexibility for the admin to choose the log handlers he wants to use. This could include other file formats at some point. On the short term, the expected handlers would be the current ownCloud log and the new one with some filters. This way we can have the full log AND the filtered log.
  • less risk changing code. By adding a new handler the risk of a regression is greatly reduced because the old code won't be changed. If something breaks it's possible to fallback to the old one.

Coming from Java I personally like what most logging frameworks do there. Set a global log level plus set dedicated log levels per package and class. That is pretty flexible and allows to turn of logs that you are not interested it

That could be another log handler with those capabilities.

@jvillafanez
Copy link
Member

@jvillafanez do you have an estimate how much time it would take to go the proper route you proposed (and a quick plan) to compare with this PR's quick, inelegant but working approach.

I'd say 1-2 days to support multiple handlers (it might take a bit more time because we might need to verify that the whole code uses the logger configured by ownCloud) and another 1-2 to create the new handler with the changes (it should be mostly a copy & paste of the current code). I'm not sure if it will be possible to have unittest (1 day for some basic tests?), and maybe 1 day for QA.
5 days should be enough to have something working properly.

'users' => ['sample-user'],
'apps' => ['files'],
'logfile' => '/tmp/test.log'],
['users' => ['user1'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharidas can you format this better with opening and closing brackets on the other row?

[
    'key' => [
        'subkey' => 'value',
        'subkey2' => 'value2',
    ],
]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I have updated PR with a format better than previous one.

@PVince81
Copy link
Contributor

@pmaier1 see #27443 (comment)

As we're running out of time I suggest keeping this PR for the short term (it's working and close to completion) and considering @jvillafanez's proposal for later. And for that also have a proper design session.

The current approach in this PR is an extension of something that already exists, so a quick addition.

$handle = @fopen($conditionalLogFile, 'a');
@chmod($conditionalLogFile, 0640);
} else {
$handle = @fopen(self::$logFile, 'a');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, did we really do a fopen on the log file for every single entry ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah.

Copy link
Member

@butonic butonic Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm what would happen if we keep the handle open until the request ends? I can only guess how that behaves when multiple processes write log messages. hmmm monolog also uses the file handle for the whole request (unless you call close()): https://github.com/Seldaek/monolog/blob/master/src/Monolog/Handler/StreamHandler.php#L95-L111 It seems the underlying fwrite implementation is thread safe: http://stackoverflow.com/a/2220574

so we should be able to optimize our log writing ... @mrow4a ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quoting official php documentation:

(http://php.net/manual/en/function.fwrite.php)

If handle was fopen()ed in append mode, fwrite()s are atomic (unless the size of string exceeds the filesystem's block size, on some platforms, and as long as the file is on a local filesystem). That is, there is no need to flock() a resource before calling fwrite(); all of the data will be written without interruption.

(http://php.net/manual/en/function.fseek.php)

If you have opened the file in append (a or a+) mode, any data you write to the file will always be appended, regardless of the file position, and the result of calling fseek() will be undefined.

I'd keep the file opened and lock the file when writing (we can't guarantee the log entry to have less than 1k). In addition I've read somewhere the writes over NFS aren't atomic, so that another reason to lock the file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't that block other requests from writing log messages until the lock is released?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but hopefully it won't be dramatic. The flow would be: open file to append and when we want to write a log entry, lock the file for writing - write - unlock. There is no need to open and close the file over and over.

We could add an option to use locks or not over the log file if there is a noticeable performance impact. I expect some overhead, but hopefully not much.

Copy link
Member

@butonic butonic Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I fear locking will kill performance. What does google say? The log4php guys have a benchmark ... it is from 2009, but we could rerun it on todays php to see the numbers: https://www.grobmeier.de/performance-ofnonblocking-write-to-files-via-php-21082009.html

running 4 scripts in parralel in the browser with 100000 loops in my vm with php 5.6 on debian produces:

Execution with NOT closing the log file took 5.9572031497955 seconds
Execution with CLOSING the log after each write file took 17.085966825485 seconds
Execution with file_put_content took 16.009223937988 seconds
Execution with leaving the file open, but LOCKING and UNLOCKING it took 5.5897269248962 seconds
Execution with nonblocking stream took 6.125638961792 seconds
Execution with the error_log method took 14.576776981354 seconds

so I think locking is the way to go ... in a different pr.

@sharidas sorry for highjacking ..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think locking is the way to go ... in a different pr.

Yes, it's outside of scope for this PR. No need to change it right now.

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from 9d3c5f5 to 344c1cd Compare March 22, 2017 04:41
@@ -297,7 +307,7 @@ public function log($level, $message, array $context = []) {

if ($level >= $minLevel) {
$logger = $this->logger;
call_user_func([$logger, 'write'], $app, $message, $level);
call_user_func([$logger, 'write'], $app, $message, $level, $logConditionFile);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense that the "global" logger (aka, this class) tells the specific loggers where they have to write? Is it useful for any other logger (syslog, for example)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impact on syslog - this was one of my questions above: #27443 (comment)
Not to complain if so, but necessay to know and document.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the syslog logger will ignore the logfile condition and always log to syslog.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raises the question: shall this be enabled for syslog too, would that make sense?

In any case it needs a note in config.sample.php and/or the documentation.
aka: Not applicapable when using syslog, or This also splits logs when using syslog

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@butonic can you check if logging to syslog will ignore logfile conditions?
This is important regarding proper documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the checks for the different logs files directly into the ownCloud log and keep the same "interface" for all the loggers. Basically, the ownCloud logger would read the log.conditions configuration and initialize itself properly (if possible).

$config = \OC::$server->getSystemConfig();

// default to ISO8601
$format = $config->getValue('logdateformat', 'c');
$logTimeZone = $config->getValue( "logtimezone", 'UTC' );
$handle = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems unneeded

Copy link
Member

@butonic butonic Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. $handle will be set to something in 116 or 119.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharidas remove this line and I am 👍 as the title says it is intended to split audit log messages from app log. makes sense IMO. For the future we should consider [monolog[(https://github.com/Seldaek/monolog/) anyway. All nice kids on the block seem to use it...

@pmaier1
Copy link
Contributor

pmaier1 commented Mar 22, 2017

@pmaier1 see #27443 (comment)
As we're running out of time I suggest keeping this PR for the short term (it's working and close to completion) and considering @jvillafanez's proposal for later. And for that also have a proper design session.
The current approach in this PR is an extension of something that already exists, so a quick addition.

Yes, fine. Nevertheless we should not forget about the better approach. I will link the comment in the platform ticket.

@mmattel
Copy link
Contributor

mmattel commented Mar 22, 2017

Please also fix config.sample.php:

  • according the code, there is no more shared_secret parameter
  • describe what the new logfile parameter is used for
  • Based on @butonic response, see above: add: Not applicapable when using syslog

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch 2 times, most recently from 5f42b8d to 603fb29 Compare March 23, 2017 08:37
@sharidas
Copy link
Contributor Author

@mmattel By mistake the shared_secret parameter was deleted. I have tried to add the same in the updated PR. I have tried to provide a brief description of the logfile parameter used in the config. I have added as a Note in the logfile sections saying: Not applicapable when using syslog Let me know the new PR looks ok or requires changes.

Thanks all for the feedback.

@mmattel
Copy link
Contributor

mmattel commented Mar 23, 2017

Not tested, looks good from my POV 👍

&& in_array($app, $multipleConditions['apps'], true)) {
$minLevel = Util::DEBUG;
if (isset($multipleConditions['logfile'])) {
$logConditionFile = $multipleConditions['logfile'];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can break the for loop after this


// if token is found in the request change set the log condition to satisfied
if ($request && hash_equals($multipleConditions['shared_secret'], $request->getParam('log_secret'))) {
$this->logConditionSatisfied = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can break the for loop after this

$this->logConditionSatisfied = true;
// if the user matches set the log condition to satisfied
if ($user !== null && in_array($user->getUID(), $multipleConditions['users'], true)) {
$this->logConditionSatisfied = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can break the for loop after this

*/
public static function write($app, $message, $level) {
public static function write($app, $message, $level, $conditionalLogFile=null) {
Copy link
Member

@butonic butonic Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oc code style adds spaces left and right of = make it $conditionalLogFile = null to increase readability

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from f7d08e5 to e26601c Compare March 23, 2017 11:19
@butonic
Copy link
Member

butonic commented Mar 23, 2017

marked as breaking change

'apps' => ['files'],
[
'shared_secret' => '57b58edb6637fe3059b3595cf9c41b9',
'users' => ['user1'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent all lines the same

],
[
'shared_secret' => '57b58edb6637fe3059b3595cf9c41b9',
'users' => ['user1'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent all lines the same

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch 2 times, most recently from b09b81f to 31f9b99 Compare March 23, 2017 12:44
@sharidas
Copy link
Contributor Author

@butonic I have updated the PR. I am trying to update the tests/lib/LoggerTest.php to incorporate new feature.

@@ -266,25 +274,44 @@ public function log($level, $message, array $context = []) {
if($this->logConditionSatisfied === null) {
// default to false to just process this once per request
$this->logConditionSatisfied = false;
if(!empty($logCondition)) {
if(!empty($logConditions)) {
// This is a fall back code so that older versions log.condition will not break.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, thinking about this how about also renaming log.condition to log.conditions. that would make fallback code easier (just check if log.condition is set, if so add it as a new element to the array the config has for log.conditions)

documnetation would need to be updated, existing configs still work and the option has a semantically correct name (plural form)

@PVince81 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, will require matching unit tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharidas ok, please also rename the config option from log.condition to log.conditions. will require changing the fallback code below ... should make it easier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@butonic I have renamed the option from log.condition to log.conditions, in the updated PR.

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from 31f9b99 to ab7143e Compare March 23, 2017 15:38
@sharidas
Copy link
Contributor Author

sharidas commented Mar 23, 2017

@mmattel @butonic I have updated the PR. Let me know if more changes are required.

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch 2 times, most recently from cea143d to 8b5757c Compare March 24, 2017 11:31
@sharidas
Copy link
Contributor Author

@butonic I have updated the PR. Let me know if further refinements or changes are to be incorporated in this PR.

$logCondition = $this->config->getValue('log.condition', []);
$logConditions = $this->config->getValue('log.conditions', []);
if (empty($logConditions)) {
$logConditions = $this->config->getValue('log.condition', []);
Copy link
Member

@butonic butonic Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$logConditions[] = $this->config->getValue('log.condition', []);
old log.condition needs to become an element of the new array.

$tmpArray['apps'] = $logConditions['apps'];
}
$logConditions = [$tmpArray];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the whole fallback code is obsolete with https://github.com/owncloud/core/pull/27443/files#r107885570

->method('getValue')
->will(($this->returnValueMap([
['loglevel', Util::WARN, Util::WARN],
['log.condition', [], [['apps' => ['files'], 'logfile' => '/tmp/test.log']]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'log.conditions'

@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from 8b5757c to ac1e454 Compare March 24, 2017 12:03
@sharidas
Copy link
Contributor Author

@butonic As per the review comments I have replaced the log.conditions in the LoggerTest.php file. I have also updated the PR to include the changes recommended:

  1. $logConditions[] = $this->config->getValue('log.condition', []);
  2. Removed the fallback code.
    Let me know if more changes are required.

@PVince81
Copy link
Contributor

@sharidas please fix the failing unit test

This change will help users to separate logs
for each apps based on conditional logging.
Changed usage of log.condition to
log.conditions.

Signed-off-by: Sujith H <sharidasan@owncloud.com>
@sharidas sharidas force-pushed the separate-audit-logs-sharidas branch from ac1e454 to a50b8db Compare March 24, 2017 13:13
@sharidas
Copy link
Contributor Author

sharidas commented Mar 24, 2017

Updated the PR by fixing the failing unit test.

@butonic
Copy link
Member

butonic commented Mar 24, 2017

I think this is ok now. Jenkins is happy. 👍

@PVince81 PVince81 merged commit 6f8bfca into master Mar 24, 2017
@PVince81 PVince81 deleted the separate-audit-logs-sharidas branch March 24, 2017 16:04
@lock
Copy link

lock bot commented Aug 3, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants