The script has been developed and tested in the following environment:
Sample configuration:
access_log daemon:/host/database/table/username/password squid
to leave a parameter unspecified use a double slash: access_log daemon://database/table/username/password squid
Default ``DBI:mysql:database=squid''
The path to the access log file is used to provide the database connection parameters.
access_log daemon:/mysql_host:port/database/table/username/password squid
The 'daemon' prefix is mandatory and tells squid that the logfile_daemon helper is to be used instead of the normal file logging.
The last parameter tells squid which log format to use when writing lines to the log daemon. Presently squid format is supported.
To leave all fields to their default values, you can use a single slash:
access_log daemon:/ squid
To specify only the database password, which by default is empty, you must leave unspecified all the other parameters by using null strings:
access_log daemon://///password squid
logfile_daemon directive
This is the current way of telling squid where the logfile daemon resides.
logfile_daemon /path/to/squid/libexec/logfile-daemon_mysql.pl
The script must be copied to the location specified in the directive.
Database
Create the database:
CREATE DATABASE squid_log;
User
Create the user:
GRANT INSERT,SELECT,CREATE ON squid_log.* TO 'squid'@'localhost' IDENTIFIED BY 'squid'; FLUSH PRIVILEGES;
Note that only CREATE, INSERT and SELECT privileges are granted to the 'squid' user. This ensures that the logfile daemon script cannot change or modify the log entries.
Table
The Daemon will attempt to initialize this table if none exists when it starts.
The table created should look like:
CREATE TABLE access_log ( id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY, time_since_epoch DECIMAL(15,3), time_response INTEGER, ip_client CHAR(15), ip_server CHAR(15), http_status_code VARCHAR(10), http_reply_size INTEGER, http_method VARCHAR(20), http_url TEXT, http_username VARCHAR(20), http_mime_type VARCHAR(50), squid_hier_status VARCHAR(20), squid_request_status VARCHAR(20) );
SELECT DISTINCT ip_client FROM access_log;
SELECT DATE(FROM_UNIXTIME(time_since_epoch)) AS date_day, COUNT(*) AS num_of_requests FROM access_log GROUP BY 1 ORDER BY 1;
SELECT squid_request_status, COUNT(*) AS n FROM access_log GROUP BY squid_request_status ORDER BY 2 DESC;
To calculate the percentage of each request status:
SELECT squid_request_status, (COUNT(*)/(SELECT COUNT(*) FROM access_log)*100) AS percentage FROM access_log GROUP BY squid_request_status ORDER BY 2 DESC;
To distinguish only between HITs and MISSes:
SELECT 'hits', (SELECT COUNT(*) FROM access_log WHERE squid_request_status LIKE '%HIT%') / (SELECT COUNT(*) FROM access_log)*100 AS percentage UNION SELECT 'misses', (SELECT COUNT(*) FROM access_log WHERE squid_request_status LIKE '%MISS%') / (SELECT COUNT(*) FROM access_log)*100 AS percentage;
SELECT '0..500', COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage FROM access_log WHERE time_response >= 0 AND time_response < 500 UNION SELECT '500..1000', COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage FROM access_log WHERE time_response >= 500 AND time_response < 1000 UNION SELECT '1000..2000', COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage FROM access_log WHERE time_response >= 1000 AND time_response < 2000 UNION SELECT '>= 2000', COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage FROM access_log WHERE time_response >= 2000;
SELECT http_mime_type, SUM(http_reply_size) as total_bytes FROM access_log GROUP BY http_mime_type ORDER BY 2 DESC;
SELECT ip_client, SUM(http_reply_size) AS total_bytes FROM access_log GROUP BY 1 ORDER BY 2 DESC;
Indexes should be created according to the queries that are more frequently run. The DDL script only creates an implicit index for the primary key column.
One solution would be to implement e.g. the ``rotate log'' command in a way that would calculate some summary values, put them in a ``summary table'' and then delete the lines used to calculate those values.
Similar cleanup code could be implemented in an external script and run periodically independently from squid log commands.
* Copyright (C) 1996-2021 The Squid Software Foundation and contributors * * Squid software is distributed under GPLv2+ license and includes * contributions from numerous individuals and organizations. * Please see the COPYING and CONTRIBUTORS files for details.
Copyright (C) 2008 by Marcello Romani
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
Report bugs or bug fixes using http://bugs.squid-cache.org/
Report serious security bugs to Squid Bugs <squid-bugs@lists.squid-cache.org>
Report ideas for new improvements to the Squid Developers mailing list <squid-dev@lists.squid-cache.org>
The Squid FAQ wiki http://wiki.squid-cache.org/SquidFaq
The Squid Configuration Manual http://www.squid-cache.org/Doc/config/