Python Download Counter

Last modified by Mélodie on 2026/05/07 13:56

Objective

Set up an automatic download counter for ISO files hosted on downloads.linuxvillage.org, with:

  • Daily parsing of Apache logs
  • Deduplication by IP + file + day, so that a resumed download, which issues several HTTP 206 Partial Content requests, is counted only once
  • Month-by-month counter aggregation in a JSON file
  • Automatic generation of a public page listing available ISOs
  • Automatic generation of a private monthly statistics page

Prerequisites

  • Python 3 installed (/usr/bin/python3)
  • Root access to the server
  • Apache logs available at /var/log/apache2/downloads-ssl-access.log*
  • cron package installed: apt install cron

Script Installation

Create the directory and place the script there. The script is attached to this page as download_stats.py.txt; remove the trailing .txt when saving it:

mkdir -p /opt/download_stats
# Place download_stats.py in this directory
chmod 750 /opt/download_stats/download_stats.py

.stats Directory Structure

The script manages the hidden .stats directory at the root of the download space. Creating it beforehand is harmless and guarantees that it exists on the first run:

mkdir -p /var/www/file-server/.stats

This directory will contain:

  • counts.json : cumulative history of monthly counters
  • index.html : private statistics page

The statistics page is accessible at the URL:
https://downloads.linuxvillage.org/.stats/

This URL is reachable by anyone who knows it; it is simply not linked from the public page.

Cron Configuration

As root:

crontab -e

Add the following line (daily execution at 01:00; cron uses the server's local time zone, so this is 01:00 UTC only if the server clock is set to UTC):

0 1 * * * /usr/bin/python3 /opt/download_stats/download_stats.py

JSON Reset

If needed (structure change, counter reset):

echo '{}' > /var/www/file-server/.stats/counts.json
/usr/bin/python3 /opt/download_stats/download_stats.py
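
Resetting the file to '{}' only works if the script accepts an empty object as valid history. As an illustration (not the attached script's actual code), a minimal sketch of tolerant loading and atomic saving could look like this. The function names load_counts and save_counts are hypothetical:

```python
import json
import os
import tempfile
from pathlib import Path


def load_counts(path):
    """Return the stored counters, or {} if the file is missing or unusable."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    # Anything other than a JSON object is treated as "no history".
    return data if isinstance(data, dict) else {}


def save_counts(path, counts):
    """Write the counters to a temporary file, then rename it into place."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(counts, handle, indent=2, sort_keys=True)
    # os.replace is atomic when source and target are on the same filesystem.
    os.replace(tmp_name, path)
```

Writing to a temporary file first means an interrupted run cannot leave a half-written counts.json behind.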

Generated Pages

The script generates two HTML pages on each run:

Public page
/var/www/file-server/index.html → lists available ISOs with their size, upload date, and links to checksums (.md5, .sha512). Content is automatically detected by directory scanning. The page follows the graphic identity of https://linuxvillage.org (colors, fonts, layout).
Statistics page (private)
/var/www/file-server/.stats/index.html → table of downloads month by month, with per-file totals and grand total.
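
To make the shape of the statistics table concrete, here is a small sketch of how the rows, per-file totals, and grand total could be derived from the stored counters. It assumes the counters are nested as {file: {"YYYY-MM": count}}, which is a guess at the JSON layout, and the name build_table is hypothetical:

```python
def build_table(counts):
    """Return (months, rows, grand_total) for a month-by-month table.

    counts is assumed to look like {"file.iso": {"2026-05": 12, ...}, ...}.
    Each row is (file name, one count per month column, per-file total).
    """
    # Column headers: every month that appears for any file, in order.
    months = sorted({month for per_month in counts.values() for month in per_month})
    rows = []
    for name in sorted(counts):
        cells = [counts[name].get(month, 0) for month in months]
        rows.append((name, cells, sum(cells)))
    grand_total = sum(total for _, _, total in rows)
    return months, rows, grand_total
```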

Script Logic

Log parsing
The script scans all downloads-ssl-access.log* files (including gzipped files from log rotation). Only GET and HEAD requests for .iso files with HTTP status codes 200 or 206 are processed.
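
The filtering rule above can be sketched as follows. This is an illustration rather than the attached script's code: it assumes the logs use Apache's common or combined format, and the names LOG_LINE, open_log, and parse_line are hypothetical.

```python
import gzip
import re
from pathlib import Path

# Assumption: Apache "common"/"combined" format, for example
# 203.0.113.5 - - [07/May/2026:13:56:01 +0000] "GET /x.iso HTTP/1.1" 206 1048576 ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "rt", encoding="utf-8", errors="replace")


def parse_line(line):
    """Return (ip, file name, timestamp) for a countable ISO request, else None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    if match["method"] not in ("GET", "HEAD"):
        return None
    if match["status"] not in ("200", "206"):
        return None
    path = match["path"].split("?", 1)[0]  # ignore any query string
    if not path.endswith(".iso"):
        return None
    return match["ip"], path.rsplit("/", 1)[-1], match["ts"]
```

Checksum files such as .iso.md5 are rejected by the suffix test, so only the ISO images themselves are counted.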
Deduplication
For each retained line, a key (ip, file, day) is constructed. If this key has already been seen in the same run, the line is ignored. This prevents counting a resumed download multiple times when it generates multiple 206 requests.
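
A minimal sketch of this deduplication, assuming each retained request is available as a hypothetical (ip, file name, Apache timestamp) tuple and that counters are grouped as {file: {"YYYY-MM": count}}. The helper names day_of and count_unique are illustrative only:

```python
from datetime import datetime


def day_of(timestamp):
    """Turn '07/May/2026:13:56:01 +0000' into '2026-05-07'.

    %b expects English month abbreviations, which holds under the default
    C locale that a cron job normally runs with.
    """
    return datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z").strftime("%Y-%m-%d")


def count_unique(hits):
    """Count at most one download per (ip, file, day)."""
    seen = set()
    counts = {}
    for ip, filename, timestamp in hits:
        day = day_of(timestamp)
        key = (ip, filename, day)
        if key in seen:
            continue  # e.g. a further 206 request of the same resumed download
        seen.add(key)
        month = day[:7]  # 'YYYY-MM'
        per_month = counts.setdefault(filename, {})
        per_month[month] = per_month.get(month, 0) + 1
    return counts
```

Because the day is part of the key, the same IP fetching the same file on two different days is counted twice.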
Monthly JSON structure
Counters are stored by file and by month (YYYY-MM). On each run, the script merges the newly computed counters with the historical data by keeping the larger of the two values for each file and month. This preserves data older than the log retention window (14 days) without double counting.
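
The merge rule can be illustrated with a short sketch, again assuming a {file: {"YYYY-MM": count}} layout. The name merge_counts is hypothetical:

```python
def merge_counts(history, fresh):
    """Merge two {file: {"YYYY-MM": count}} mappings, keeping the larger value."""
    # Copy the history so the caller's data is left untouched.
    merged = {name: dict(months) for name, months in history.items()}
    for name, months in fresh.items():
        target = merged.setdefault(name, {})
        for month, count in months.items():
            target[month] = max(target.get(month, 0), count)
    return merged
```

Taking the maximum rather than the sum matters because every run re-reads the same retained logs: adding the fresh counts to the stored ones would count the same downloads again each day.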
Directory scanning
The list of files displayed on the HTML pages is built dynamically on each run. Adding a new ISO version requires no manual intervention.
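
As a rough sketch of such a scan (not the attached script's code), the listing could be built as below. It assumes that the file's modification time stands in for the upload date, that checksum files sit next to the ISO as name.iso.md5 and name.iso.sha512, and that subdirectories are scanned too. The name scan_isos is hypothetical:

```python
from datetime import datetime, timezone
from pathlib import Path


def scan_isos(root):
    """List every .iso under root with size, date, and available checksum files."""
    root = Path(root)
    entries = []
    for iso in sorted(root.rglob("*.iso")):
        info = iso.stat()
        modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
        entries.append({
            "path": iso.relative_to(root).as_posix(),
            "name": iso.name,
            "size_bytes": info.st_size,
            # Assumption: modification time approximates the upload date.
            "modified": modified.strftime("%Y-%m-%d"),
            # Only link checksum files that actually exist.
            "checksums": [
                iso.name + ext
                for ext in (".md5", ".sha512")
                if iso.with_name(iso.name + ext).exists()
            ],
        })
    return entries
```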

Attachment

The download_stats.py.txt file attached to this page is the source Python script.
Rename it to download_stats.py after downloading, place it in /opt/download_stats, then set its permissions:

chmod 750 /opt/download_stats/download_stats.py
