Wiki source code of Compteur de téléchargements Python
Last modified by Mélodie on 2026/05/07 13:56
| author | version | line-number | content |
|---|---|---|---|
| |
15.1 | 1 | == Objective == |
| |
1.1 | 2 | |
| |
15.1 | 3 | Set up an automatic download counter for ISO files hosted on ##downloads.linuxvillage.org##, with: |
| |
1.1 | 4 | |
| |
15.1 | 5 | * Daily parsing of Apache logs |
| 6 | * Deduplication by IP + file + day (to avoid false duplicates from HTTP 206 Partial Content requests in resumed downloads) | ||
| 7 | * Month-by-month counter aggregation in a JSON file | ||
| 8 | * Automatic generation of a public page listing available ISOs | ||
| 9 | * Automatic generation of a private monthly statistics page | ||
| |
1.1 | 10 | |
| |
15.1 | 11 | == Prerequisites == |
| |
1.1 | 12 | |
| |
15.1 | 13 | * Python 3 installed (##/usr/bin/python3##) |
| 14 | * Root access to the server | ||
| 15 | * Apache logs available at ##/var/log/apache2/downloads-ssl-access.log*## | ||
| 16 | * ##cron## package installed: ##apt install cron## | ||
| |
1.1 | 17 | |
| |
15.1 | 18 | == Script Installation == |
| |
1.1 | 19 | |
| |
15.1 | 20 | Create the directory and copy the script into it (the script is attached to this page; remove the trailing ##.txt## from its file name): |
| |
1.1 | 21 | |
| 22 | {{code language="bash"}} | ||
| 23 | mkdir -p /opt/download_stats | ||
| |
15.1 | 24 | # Place download_stats.py in this directory |
| |
1.1 | 25 | chmod 750 /opt/download_stats/download_stats.py |
| 26 | {{/code}} | ||
| 27 | |||
| |
15.1 | 28 | == .stats Directory Structure == |
| |
1.1 | 29 | |
| |
15.1 | 30 | The script creates and manages the hidden ##.stats## directory at the root of the download space automatically. To create it by hand ahead of the first run: |
| |
1.1 | 31 | |
| 32 | {{code language="bash"}} | ||
| 33 | mkdir -p /var/www/file-server/.stats | ||
| 34 | {{/code}} | ||
| 35 | |||
| |
15.1 | 36 | This directory will contain: |
| |
1.1 | 37 | |
| |
15.1 | 38 | * ##counts.json## : cumulative history of monthly counters |
| 39 | * ##index.html## : private statistics page | ||
| |
1.1 | 40 | |
| |
15.1 | 41 | The statistics page is accessible at the URL: |
| |
1.1 | 42 | ##https:~/~/downloads.linuxvillage.org/.stats/## |
| 43 | |||
| |
15.1 | 44 | The URL is reachable by anyone who knows it; it is simply not linked from any public page. |
| |
1.1 | 45 | |
| |
15.1 | 46 | == Cron Configuration == |
| |
1.1 | 47 | |
| |
15.1 | 48 | As root: |
| |
1.1 | 49 | |
| 50 | {{code language="bash"}} | ||
| 51 | crontab -e | ||
| 52 | {{/code}} | ||
| 53 | |||
| |
15.1 | 54 | Add the following line (runs daily at 01:00; cron uses the server's time zone, UTC here): |
| |
1.1 | 55 | |
| 56 | {{code language="bash"}} | ||
| 57 | 0 1 * * * /usr/bin/python3 /opt/download_stats/download_stats.py | ||
| 58 | {{/code}} | ||
| 59 | |||
| |
15.1 | 60 | == JSON Reset == |
| |
1.1 | 61 | |
| |
15.1 | 62 | To start over with empty counters (for example after a change to the JSON structure), empty the file and run the script once: |
| |
1.1 | 63 | |
| 64 | {{code language="bash"}} | ||
| 65 | echo '{}' > /var/www/file-server/.stats/counts.json | ||
| 66 | /usr/bin/python3 /opt/download_stats/download_stats.py | ||
| 67 | {{/code}} | ||
| 68 | |||
| |
15.1 | 69 | == Generated Pages == |
| |
1.1 | 70 | |
| |
15.1 | 71 | The script generates two HTML pages on each run: |
| |
1.1 | 72 | |
| |
15.1 | 73 | ; Public page |
| 74 | : ##/var/www/file-server/index.html## → lists available ISOs with their size, upload date, and links to checksums (.md5, .sha512). Content is automatically detected by directory scanning. The page follows the graphic identity of https://linuxvillage.org (colors, fonts, layout). | ||
| |
1.1 | 75 | |
| |
15.1 | 76 | ; Statistics page (private) |
| 77 | : ##/var/www/file-server/.stats/index.html## → table of downloads month by month, with per-file totals and grand total. | ||
| |
1.1 | 78 | |
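The totals shown on the statistics page can be derived directly from the monthly counters. The following is a minimal sketch, assuming the counters are held as a ##{file: {"YYYY-MM": count}}## mapping; the function name ##summarize## and the exact layout are illustrative and are not taken from the attached script.

```python
def summarize(counts):
    """Return (months, per_file_totals, grand_total) for a stats table.

    `counts` is assumed to look like {"file.iso": {"2026-05": 12, ...}, ...}.
    """
    # Column headers: every month that appears for any file, oldest first.
    months = sorted({month for per_month in counts.values() for month in per_month})
    # Row totals: sum of all months for each file.
    per_file_totals = {name: sum(per_month.values()) for name, per_month in counts.items()}
    # Bottom-right cell: sum of all row totals.
    grand_total = sum(per_file_totals.values())
    return months, per_file_totals, grand_total
```

A file with no hits in a given month simply has no entry for it, so a renderer would print 0 for any missing month.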
| |
15.1 | 79 | == Script Logic == |
| |
1.1 | 80 | |
| |
15.1 | 81 | ; Log parsing |
| 82 | : The script scans all ##downloads-ssl-access.log*## files (including gzipped files from log rotation). Only GET and HEAD requests for ##.iso## files with HTTP status codes 200 or 206 are processed. | ||
| |
1.1 | 83 | |
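The filtering described above could look roughly like the sketch below. It assumes the virtual host writes the standard Apache "combined" log format and that request paths carry no query string; the names ##LINE_RE##, ##open_log## and ##parse_line## are hypothetical and may differ from the attached script.

```python
import gzip
import re
from datetime import datetime
from pathlib import Path

# Assumption: Apache "combined" format. Only GET/HEAD requests for *.iso
# paths answered with status 200 or 206 match this pattern.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]+\] '
    r'"(?P<method>GET|HEAD) (?P<path>\S+\.iso) [^"]*" (?P<status>200|206) '
)

def open_log(path):
    """Open a plain or gzip-compressed (rotated) log file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")

def parse_line(line):
    """Return (ip, file name, ISO date) for a retained request, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    filename = match.group("path").rsplit("/", 1)[-1]
    day = datetime.strptime(match.group("day"), "%d/%b/%Y").date().isoformat()
    return match.group("ip"), filename, day
```

A caller would loop over ##Path("/var/log/apache2").glob("downloads-ssl-access.log*")##, open each file with ##open_log##, and keep the lines for which ##parse_line## returns a tuple.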
| |
15.1 | 84 | ; Deduplication |
| 85 | : For each retained line, a key ##(ip, file, day)## is constructed. If this key has already been seen in the same run, the line is ignored. This prevents counting a resumed download multiple times when it generates multiple 206 requests. | ||
| |
1.1 | 86 | |
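As an illustration of the deduplication rule, the sketch below counts one download per ##(ip, file, day)## key and groups the result by month. It assumes each record is an ##(ip, file, day)## tuple with the day written as ##YYYY-MM-DD##; the function name ##count_downloads## is hypothetical.

```python
from collections import defaultdict

def count_downloads(records):
    """Count downloads per file and month, one hit per (ip, file, day)."""
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for ip, filename, day in records:
        key = (ip, filename, day)
        if key in seen:
            # Same client, same file, same day: typically another 206
            # range request belonging to a resumed download.
            continue
        seen.add(key)
        counts[filename][day[:7]] += 1  # "YYYY-MM-DD"[:7] gives "YYYY-MM"
    # Convert to plain dicts so the result can be written as JSON.
    return {name: dict(months) for name, months in counts.items()}
```

With this rule, a client that fetches the same ISO on two different days counts twice, while any number of range requests on a single day counts once.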
| |
15.1 | 87 | ; Monthly JSON structure |
| 88 | : Counters are stored per file and per month (##YYYY-MM##). On each run, the script merges the freshly computed counters with the stored history by keeping the larger of the two values for each file and month. This preserves counts from before the log retention window (14 days) and avoids double counting when the same log lines are parsed again on a later run. | ||
| |
1.1 | 89 | |
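The maximum-based merge can be sketched as follows. This is an illustration under the assumption that both the stored history and the fresh counters use the ##{file: {"YYYY-MM": count}}## layout; ##merge_counts## is a hypothetical name.

```python
def merge_counts(history, fresh):
    """Merge fresh counters into the history, keeping the larger value."""
    # Copy the history so the caller's data is left untouched.
    merged = {name: dict(months) for name, months in history.items()}
    for name, months in fresh.items():
        target = merged.setdefault(name, {})
        for month, count in months.items():
            # A month still fully covered by the logs yields the same or a
            # higher count; a month that has partly rotated out yields a
            # lower one, in which case the stored value wins.
            target[month] = max(target.get(month, 0), count)
    return merged
```

Taking the maximum rather than adding is what makes repeated runs over the same logs harmless: running the script twice in a row leaves the JSON unchanged.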
| |
15.1 | 90 | ; Directory scanning |
| 91 | : The list of files displayed on the HTML pages is built dynamically on each run. Adding a new ISO version requires no manual intervention. | ||
| |
1.1 | 92 | |
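A directory scan of this kind might be written as in the sketch below. It makes several assumptions that are not confirmed by this page: ISOs are searched recursively, the file's modification time stands in for the upload date, and checksum files sit next to the ISO with the suffix appended (##name.iso.md5##, ##name.iso.sha512##). The name ##scan_isos## is hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path

def scan_isos(root):
    """List ISO files under `root` with size, date and available checksums."""
    entries = []
    for iso in sorted(Path(root).rglob("*.iso")):
        info = iso.stat()
        # Assumption: modification time is used as the upload date.
        modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
        entries.append({
            "name": iso.name,
            "size_bytes": info.st_size,
            "modified": modified.strftime("%Y-%m-%d"),
            # Keep only the checksum files that actually exist.
            "checksums": [ext for ext in (".md5", ".sha512")
                          if Path(str(iso) + ext).exists()],
        })
    return entries
```

Because the list is rebuilt from the file system on every run, dropping a new ISO into the download space is enough for it to appear on the next generated page.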
| |
15.1 | 93 | == Attachment == |
| |
1.1 | 94 | |
| |
15.1 | 95 | The ##download_stats_en.py.txt## file attached to this page is the source Python script. |
| 96 | Rename it to ##download_stats.py## after download (the name used by the cron entry and the other commands on this page), then: | ||
| |
1.1 | 97 |
| 98 | {{code language="bash"}} | ||
| |
15.1 | 99 | chmod 750 /opt/download_stats/download_stats.py |
| |
1.1 | 100 | {{/code}} |