Command line utility, and mini framework, to harvest real time statistics on live log files over different periods of time expressed in seconds, as well as to report bursts where the average logs per second becomes greater than a predefined value.

marregui/logpulse

LogPulse

LogPulse is a command line utility that monitors a file ('/tmp/access.log' by default) and periodically reports general statistics and high traffic events.

Every 10 seconds, general statistics are reported to stdout for all data appended to the file during those 10 seconds.

Every 120 seconds, the data appended during that period is analysed:

  • when the average traffic over the period rises above the threshold, a 'High traffic' message is reported.
  • when the average traffic drops back below the threshold, a 'Back to normal traffic' message is reported.
  • both messages include the time at which the threshold was crossed. The default threshold is 10.0 requests per second.
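The two alerts behave like a small state machine: a message is emitted only when the average crosses the threshold, not on every period. The following is a minimal sketch of that idea, not the project's HighTrafficGauge; the class name TrafficAlertSketch, the method onPeriodEnd, and the message wording are assumptions made for illustration, as is the choice to leave the state unchanged when the average equals the threshold exactly.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical sketch; not the project's HighTrafficGauge. */
final class TrafficAlertSketch {
    private final double thresholdPerSecond;
    private boolean highTraffic; // true while the last reported state was "High traffic"

    TrafficAlertSketch(double thresholdPerSecond) {
        this.thresholdPerSecond = thresholdPerSecond;
    }

    /**
     * Call once per period with the number of requests seen in it.
     * Returns a message only when the average crosses the threshold.
     */
    Optional<String> onPeriodEnd(long requestCount, long periodSeconds, Instant now) {
        double average = (double) requestCount / periodSeconds;
        if (!highTraffic && average > thresholdPerSecond) {
            highTraffic = true;
            return Optional.of("High traffic: " + average + " req/s on average at " + now);
        }
        if (highTraffic && average < thresholdPerSecond) {
            highTraffic = false;
            return Optional.of("Back to normal traffic: " + average + " req/s on average at " + now);
        }
        return Optional.empty(); // no crossing, nothing to report
    }

    public static void main(String[] args) {
        TrafficAlertSketch gauge = new TrafficAlertSketch(10.0);
        long[] countsPerPeriod = {600, 2400, 3000, 600}; // requests seen in each 120 s period
        for (long count : countsPerPeriod) {
            gauge.onPeriodEnd(count, 120, Instant.now()).ifPresent(System.out::println);
        }
    }
}
```

With the sample counts in main, only the second and fourth periods produce a message, because those are the only periods in which the average crosses 10.0 requests per second.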

Note that two minutes' worth of log data is kept in memory. A rough estimate, assuming an average log line size of 200 bytes:

  • At an average of 1e6 requests per second:

    200 * 1e6 * 120 = 24e9 bytes  ~ 22.35 GiB
    
  • At an average of 10 requests per second:

    200 * 10 * 120 = 24e4 bytes  ~ 234.4 KiB
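The estimate is simply line size times request rate times window length. A small sketch that reproduces the arithmetic for other rates follows; the class and method names are hypothetical, and the result counts raw line bytes only, so the real heap footprint will be higher once JVM object overhead is included.

```java
import java.util.Locale;

/** Hypothetical helper for estimating the raw bytes held in the in-memory window. */
final class WindowMemoryEstimate {

    /** Raw bytes = average line size * requests per second * window length in seconds. */
    static long bytesInWindow(long avgLineBytes, long requestsPerSecond, long windowSeconds) {
        // multiplyExact fails loudly on overflow instead of silently wrapping around
        return Math.multiplyExact(Math.multiplyExact(avgLineBytes, requestsPerSecond), windowSeconds);
    }

    public static void main(String[] args) {
        long heavy = bytesInWindow(200, 1_000_000, 120);
        long light = bytesInWindow(200, 10, 120);
        System.out.printf(Locale.ROOT, "%d bytes = %.2f GiB%n", heavy, heavy / (1024.0 * 1024 * 1024));
        System.out.printf(Locale.ROOT, "%d bytes = %.1f KiB%n", light, light / 1024.0);
    }
}
```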
    

For long periods at high request rates, consider partitioning the schedule into shorter schedules whose results are aggregated.
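One way to read this advice: instead of holding every line of a long period, keep one request count per short sub-period and aggregate the counts. The sketch below illustrates that with a fixed-size ring of counters. It is an assumption about how such partitioning could look, not something the project provides, and all names in it are hypothetical. It also assumes the average is always taken over the full window, even before the ring has filled up.

```java
/** Hypothetical sketch: a long window stored as per-sub-period counts rather than raw lines. */
final class BucketedWindowSketch {
    private final long[] buckets; // one request count per sub-period
    private int next;             // index of the oldest bucket, overwritten next
    private long total;           // running sum of all buckets

    BucketedWindowSketch(int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        this.buckets = new long[bucketCount];
    }

    /** Records the request count of one finished sub-period, evicting the oldest one. */
    void addBucket(long count) {
        total += count - buckets[next];
        buckets[next] = count;
        next = (next + 1) % buckets.length;
    }

    /** Average requests per second over the whole window of bucketCount * bucketSeconds. */
    double averagePerSecond(int bucketSeconds) {
        return (double) total / ((long) buckets.length * bucketSeconds);
    }

    public static void main(String[] args) {
        BucketedWindowSketch window = new BucketedWindowSketch(12); // 12 buckets of 10 s = 120 s
        for (int i = 0; i < 12; i++) {
            window.addBucket(100); // 100 requests in each 10 s sub-period
        }
        System.out.println(window.averagePerSecond(10));
    }
}
```

Memory use then depends on the number of buckets rather than on the request rate, at the cost of losing per-line detail for the long window.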

Each line of the file is expected to be in Common Log Format (CLF).
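For reference, a CLF line has the shape `host ident authuser [date] "request" status bytes`. The following is a minimal parsing sketch using a regular expression. It is not the project's CLF class; the names ClfLineSketch and Entry are hypothetical, and treating a `-` byte count as 0 is an assumption of this sketch.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of CLF parsing; not the project's CLF class. */
final class ClfLineSketch {
    /** One parsed log line. */
    record Entry(String host, String ident, String user, OffsetDateTime timestamp,
                 String request, int status, long bytes) {}

    // host ident authuser [date] "request" status bytes
    private static final Pattern CLF = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    // e.g. 10/Oct/2000:13:55:36 -0700
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    /** Returns the parsed entry, or empty when the line is not valid CLF. */
    static Optional<Entry> parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        try {
            long bytes = "-".equals(m.group(7)) ? 0L : Long.parseLong(m.group(7));
            return Optional.of(new Entry(
                    m.group(1), m.group(2), m.group(3),
                    OffsetDateTime.parse(m.group(4), TIMESTAMP),
                    m.group(5), Integer.parseInt(m.group(6)), bytes));
        } catch (DateTimeParseException | NumberFormatException e) {
            return Optional.empty(); // malformed date or oversized number
        }
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "
                + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326";
        System.out.println(parse(line));
    }
}
```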

To change the defaults, pass the keys to override as arguments, each followed by its value. For example:

./logpulse -gsp 5 -f logpulse-store/testLogs.log                         

This monitors the file 'logpulse-store/testLogs.log' and reports
general statistics every 5 seconds instead of 10.

Available parameter keys:

-h | -help: Shows this text. This key takes no value.
                
-f | -file | -in: Path (absolute, or relative to the launch script's location) 
        of the file being processed. 
        Value type: text, default: /tmp/access.log
                        
-gsp | -generalStatsPeriod: Sets the period (seconds) for general statistics 
        reporting. 
        Value type: int, default: 10
                         
-tgp | -trafficGaugePeriod: Sets the period (seconds) for high traffic gauge's 
        reporting. 
        Value type: int, default: 120
                        
-tgt | -trafficGaugeThreshold: Sets the threshold (requests per second on avg. 
        for the considered period) for high traffic reporting. 
        Value type: double, default: 10.0 
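
The key/value convention above can be sketched as a small parser that starts from the documented defaults and overrides them pairwise. This is an illustration only and not the project's Parameters.parseArgs: the class name ArgsSketch, the canonical key names, and the error messages are assumptions, and the value-less -h | -help key is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of key/value argument parsing; not the project's Parameters class. */
final class ArgsSketch {
    // Every documented alias mapped to one canonical key name (the names are this sketch's choice).
    private static final Map<String, String> ALIASES = Map.ofEntries(
            Map.entry("-f", "file"),
            Map.entry("-file", "file"),
            Map.entry("-in", "file"),
            Map.entry("-gsp", "generalStatsPeriod"),
            Map.entry("-generalStatsPeriod", "generalStatsPeriod"),
            Map.entry("-tgp", "trafficGaugePeriod"),
            Map.entry("-trafficGaugePeriod", "trafficGaugePeriod"),
            Map.entry("-tgt", "trafficGaugeThreshold"),
            Map.entry("-trafficGaugeThreshold", "trafficGaugeThreshold"));

    /** Returns the documented defaults, overridden by any key/value pairs in args. */
    static Map<String, String> parse(String... args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("file", "/tmp/access.log");
        values.put("generalStatsPeriod", "10");
        values.put("trafficGaugePeriod", "120");
        values.put("trafficGaugeThreshold", "10.0");
        for (int i = 0; i < args.length; i += 2) {
            String key = ALIASES.get(args[i]);
            if (key == null) {
                throw new IllegalArgumentException("Unknown key: " + args[i]);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for key: " + args[i]);
            }
            values.put(key, args[i + 1]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("-gsp", "5", "-f", "logpulse-store/testLogs.log"));
    }
}
```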

For development

Requirements: JDK 17 and Gradle (preferably 7.x.x; we use 7.4.2).

To create the documentation: ./gradlew javadoc

To build the project: ./gradlew clean build

The build produces build/libs/logpulse-1.0-SNAPSHOT-all.jar, an uberjar containing all dependencies, which you can run with a command like:

java -Xmx1G -Dfile.encoding=UTF-8 -jar build/libs/logpulse-1.0-SNAPSHOT-all.jar -help

The launch script is more convenient:

./logpulse

The Main class contains:

    public static void main(String[] args) {
        Parameters parameters = Parameters.parseArgs(args);
        Scheduler<CLF> scheduler = new Scheduler<>(new CLFReadoutHandler(parameters.file), false);
        scheduler.setPeriodicSchedule(new GeneralStatsView(System.out));
        scheduler.setPeriodicSchedule(new HighTrafficGauge(System.out));
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                scheduler.stop();
            } catch (IllegalStateException ignore) {
                // it means it is already stopped, likely because
                // the parent folder has been deleted
            }
        }, "logpulse-shutdown-hook"));
        scheduler.start();
    }

This hints at how the project can double as a framework: it gives you efficient readout of CLF lines from an ever-growing file, plus a mechanism to set periodic schedules that act on them.
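
To illustrate what reading from an ever-growing file involves, here is a minimal polling sketch that remembers how far it has read and returns only newly completed lines. It does not reflect how CLFReadoutHandler is implemented; the class TailSketch and its method are hypothetical. It assumes UTF-8 content and `\n` line endings, restarts from the beginning if the file shrinks, and reads at most 1 MiB per call, so a single line longer than that would never be returned.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of incremental readout; not the project's CLFReadoutHandler. */
final class TailSketch {
    private static final int MAX_CHUNK = 1 << 20; // read at most 1 MiB per call

    private final Path file;
    private long offset; // byte position of the first unread byte

    TailSketch(Path file) {
        this.file = file;
    }

    /** Returns the complete lines appended since the previous call. */
    List<String> readNewLines() throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < offset) {
                offset = 0; // file was truncated: start over (assumption of this sketch)
            }
            if (length == offset) {
                return lines; // nothing new
            }
            byte[] chunk = new byte[(int) Math.min(length - offset, MAX_CHUNK)];
            raf.seek(offset);
            raf.readFully(chunk);
            int start = 0;
            for (int i = 0; i < chunk.length; i++) {
                if (chunk[i] == '\n') {
                    lines.add(new String(chunk, start, i - start, StandardCharsets.UTF_8));
                    start = i + 1;
                }
            }
            offset += start; // a trailing partial line is left for the next call
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        TailSketch tail = new TailSketch(Path.of(args.length > 0 ? args[0] : "/tmp/access.log"));
        tail.readNewLines().forEach(System.out::println);
    }
}
```

Calling readNewLines() from a periodic task yields exactly the lines appended since the previous call, which is the kind of input a periodic schedule needs.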
