David Gessel
Audio File Analysis With Sox
Sox is a cool program, a “Swiss Army knife of sound processing,” and a useful tool for checking audio files that belongs in anyone’s audio processing workflow. I thought it might be useful for detecting improperly encoded audio files or files that have decayed due to bit rot or cosmic rays or other acoustic calamities, and it is.
Sox has two statistical output command line options, “stat” and “stats,” which produce different but useful data. What makes sox useful for this, and what some metadata-checking programs (like the very useful MP3Diags-unstable) don’t do, is that it actually decodes the file and computes stats from the audio data itself. This takes some time, about 0.7 seconds for a typical (5 minute) audio file. That may seem fast, and it is certainly much faster than real time, but if you want to process 22,000 files, it will take 4–5 hours.
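For reference, each report is easy to produce from the command line; the -n null output discards the decoded audio, and both reports write to stderr (file name illustrative):

sox track.mp3 -n stat
sox track.mp3 -n stats

For a stereo file, the stats report looks roughly like this (the values shown are illustrative, not from my data set); the script at the bottom of the post parses the Overall column of this layout:

             Overall     Left      Right
DC offset   0.000015  0.000015  0.000011
Min level  -0.840698 -0.840698 -0.839417
Max level   0.846863  0.838928  0.846863
Pk lev dB      -1.44     -1.51     -1.44
RMS lev dB    -14.27    -14.36    -14.18
RMS Pk dB     -11.08    -11.23    -11.08
RMS Tr dB     -85.45    -85.45    -82.53
Crest factor       -      4.39      4.34
Flat factor     0.00      0.00      0.00
Pk count           2         2         2
Bit-depth      16/16     16/16     16/16
Num samples    9.16M
Length s     207.672
Scale max   1.000000
Window s       0.050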
Some of the calculated values mean something fairly obvious: “Flat factor,” for example, is related to the maximum number of identical samples in a row, which would make the waveform “flat.” But the computation isn’t linear, and there is a maximum value (>30 is a bad sign, usually).
So I wrote a little program to parse out the results and generate a csv file of all of the results in tabular form for analysis in LibreOffice Calc. I focused on a few variables I thought might be indicative of problems, rather than all of them:
- DC offset—which you’d hope was always close to zero.
- Min-Max level difference—min and max should be close to symmetric and usually are, but not always.
- RMS pk dB—which is normally set for -3 or -6 dB, but shouldn’t peak at nearly silent, -35 dB.
- Flat factor—which is most often 0, but frequently not.
- Pk count—the number of samples at peak, which is most often 2.
- Length s—the length of the file in seconds, which might indicate a play problem.
After processing 22,000 files, I gathered some statistics on what is “normal” (ish, for this set of files), which may be of some use in interpreting sox results. The source code for my little bash script is at the bottom of the post.
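As the script’s header comment notes, a typical batch run that feeds the spreadsheet looks something like this (soxverify.sh being whatever you name the script below):

find . -depth -type f -name "*.mp3" -exec ./soxverify.sh {} \; > stats.csv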
DC Bias
DC bias really should be very close to zero, and most files are fairly close, but some in the sample had a bias greater than 0.1, though even that has no perceptible audio impact.
Min Level – Max Level
Min level is most often normalized to -1 and max level to +1, which would yield a difference of 2, or a difference of absolute values of 0 (as measured), and this is the most common result (31.13%). A few files, 0.05% or so, have a difference greater than 0.34, which is likely to be a problem and worth a listen.
RMS pk dB
Peak dB is a pretty important parameter to optimize as an audio engineer, and common settings are -6 dB and -3 dB for various types of music; however, if levels are set for a collection of files as a group, individual files can be quite a bit lower or, sometimes, a bit higher. Some types of music, psychobilly for example, might be set even a little over -3 dB. A file much above -3 dB might have sound quality problems or might be corrupted into pure noise; 0.05% of files have a peak dB over -2.2 dB. A file with peak amplitudes much below -30 dB may be silent and certainly will be molto pianissimo; 0.05% of files have a peak dB below -31.2 dB.
A very quiet sample, with a Pk dB of -31.58, would likely suffer audible quantization noise, the entire program using only a small fraction of the available headroom (10^(-31.58/20) ≈ 0.026 of full scale).
Flat factor
Flat factor is a complicated measure, roughly (but not exactly) the maximum number of consecutive identical samples. @AkselA offered a useful one-liner (sox -n -p synth 10 square 1 norm -3 | sox - -n stats) to verify that it is not, exactly, just a run of identical values; just what it actually is isn’t that well documented. Whatever it is exactly, 0 is the right answer, and 68% of files get it right. Only 0.05% of files have a flat factor greater than 27.
Pk count
Peak count is a good way to measure clipping. Per the table below, 0.16% of files have a pk count over 1,000, but the most common value, at 65.5% of files, is 2, meaning most files are normalized to peak at 100%… exactly twice (log scale chart, the peak is at 2).
As an example, a file with levels set to -2.31 and a flat factor of only 14.31 but with a Pk count of 306,000 looks like this in Audacity with “Show Clipping” on, and yet sounds kinda like you’d think it is supposed to. Go figure.
Statistics
What’s life without statistics? Sample population: 22,096 files; 205 minutes of run time, or 0.56 seconds per file.
| Stats | DC bias | min amp | max amp | min-max | avg pk dB | flat factor | pk count | length s |
|---|---|---|---|---|---|---|---|---|
| Mode | 0.000015 | -1 | 1 | 0 | -10.05 | 0.00 | 2 | 160 |
| Count at Mode | 473 | 7,604 | 7,630 | 6,879 | 39 | 14,940 | 14,472 | 14 |
| % at mode | 2.14% | 34.41% | 34.53% | 31.13% | 0.18% | 67.61% | 65.50% | 0.06% |
| Average | 0.00105 | -0.80 | 0.80 | 0.03 | -10.70 | 2.03 | 288.51 | 226.61 |
| Min | 0 | -1 | 0.0480 | 0 | -34.61 | 0 | 1 | 4.44 |
| Max | 0.12523 | -0.0478 | 1 | 0.497 | -1.25 | 129.15 | 306,000 | 7,176 |
| Threshold | 0.1 | -0.085 | 0.085 | 0.25 | -2.2 | 27 | 1,000 | 1,200 |
| Count @ Thld | 3 | 11 | 10 | 68 | 12 | 12 | 35 | 45 |
| % @ Thld | 0.01% | 0.05% | 0.05% | 0.31% | 0.05% | 0.05% | 0.16% | 0.20% |
Bash Script
#!/bin/bash
###############################################################
# This program uses sox to analyze an audio file for some
# common indicators that the actual file data may have issues
# such as corruption or having been badly prepared or modified.
# It takes a file path as an input and outputs to stdout the
# results of tests if that file exceeds the threshold values set
# below or, if the last conditional is commented out, all files.
# A typical invocation might be something like:
# find . -depth -type f -name "*.mp3" -exec soxverify.sh {} \; > stats.csv
# The code does not handle single (mono) or multi-track files and
# will throw an error: the field positions assume stereo. If sox
# can't read the file it will throw an error to the csv file.
# Flagged files probably warrant a sound check.

##############################################
### Set reasonable threshold values ##########

# DC offset should be close to zero, but is almost never exactly.
# The program uses the absolute value of DC offset (which can be
# negative or positive) as a test, normalized to 1.0.
# If the value is high, total fidelity might be improved by
# using audacity to remove the bias and recompressing.
# Files that exceed dc_offset_threshold will be output with
# Error Code "O"
dc_offset_threshold=0.1

# Most files have fairly symmetric min_level and max_level
# values. If the min and max aren't symmetric, there may
# be something wrong, so we compute and test. 99.95% of files have
# a delta below 0.34; files with a min_max_delta above
# min_max_delta_threshold will be flagged EC "D"
min_max_delta_threshold=0.34

# Average peak dB is a standard target for normalization, and
# replay gain is commonly used to adjust files or albums that weren't
# normalized to hit that value. 99.95% of files have a
# RMS_pk_dB of < -2.2; higher than that is weird, check the sound.
# Exceeding this threshold generates EC "H"
RMS_pk_dB_threshold=-2.2

# Extremely quiet files might also be indicative of a problem,
# though some are simply molto pianissimo. 99.95% of files have
# a minimum RMS_pk_dB > -31.2. Files with a RMS pk dB <
# RMS_min_dB_threshold will be flagged with EC "Q"
RMS_min_dB_threshold=-31.2

# Flat_factor is a non-linear measure of sequential samples at the
# same level. 68% of files have a flat factor of 0, but this could
# be intentional for a track with moments of absolute silence.
# 99.95% of files have a flat factor < 27. Exceeding this threshold
# generates EC "F"
flat_factor_threshold=27

# peak_count is the number of samples at maximum volume and any value > 2
# is a strong indicator of clipping. 65% of files are mixed so that 2 samples
# peak at max. However, a lot of "loud" music is engineered to clip:
# 8% of files have >100 "clipped" samples and, in this data set,
# 0.16% > 1000 samples. Exceeding this threshold generates EC "C"
pk_count_threshold=1000

# Zero length (in seconds) or extremely long files may be, depending on
# one's data set, indicative of some error. A file that plays back
# in less time than length_s_threshold will generate EC "S"; a
# file playing back longer than length_l_threshold: EC "L"
length_s_threshold=4
length_l_threshold=1200

# Check if a file path is provided as an argument
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <audio_file_path>"
    exit 1
fi

audio_file="$1"

# Check if the file exists
if [ ! -f "$audio_file" ]; then
    echo "Error: File not found - $audio_file"
    exit 1
fi

# Run sox with the stats option, remove newlines, and capture the output
sox_stats=$(sox "$audio_file" --replay-gain off -n stats 2>&1 | tr '\n' ' ')

# clean up the output: squeeze repeated spaces, trim the leading one
sox_stats=$( sed 's/[ ]\+/ /g' <<< $sox_stats )
sox_stats=$( sed 's/^ //g' <<< $sox_stats )

# Check if the output contains "Overall" as a substring
if [[ ! "$sox_stats" =~ Overall ]]; then
    echo "Error: Unexpected output from sox: $1"
    echo "$sox_stats"
    echo ""
    exit 1
fi

# Extract and set variables (field positions per the stereo stats layout)
dc_offset=$(echo "$sox_stats" | cut -d ' ' -f 6)
min_level=$(echo "$sox_stats" | cut -d ' ' -f 11)
max_level=$(echo "$sox_stats" | cut -d ' ' -f 16)
RMS_pk_dB=$(echo "$sox_stats" | cut -d ' ' -f 34)
flat_factor=$(echo "$sox_stats" | cut -d ' ' -f 50)
pk_count=$(echo "$sox_stats" | cut -d ' ' -f 55)
length_s=$(echo "$sox_stats" | cut -d ' ' -f 67)

# convert DC offset to absolute value
dc_offset=$(echo "$dc_offset" | tr -d '-')

# convert min and max_level to absolute values:
abs_min_lev=$(echo "$min_level" | tr -d '-')
abs_max_lev=$(echo "$max_level" | tr -d '-')

# compute delta and convert to abs value
min_max_delta_int=$(echo "$abs_max_lev - $abs_min_lev" | bc -l)
min_max_delta=$(echo "$min_max_delta_int" | tr -d '-')

# parse pk_count (sox abbreviates large counts with k and M suffixes)
pk_count=$( sed 's/k/000/' <<< $pk_count )
pk_count=$( sed 's/M/000000/' <<< $pk_count )

# Compare values against thresholds
threshold_failed=false
err_code="ERR: "

# Offset bad check
if (( $(echo "$dc_offset > $dc_offset_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="O"
fi

# Large delta check
if (( $(echo "$min_max_delta >= $min_max_delta_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="D"
fi

# Mix set too high check
if (( $(echo "$RMS_pk_dB > $RMS_pk_dB_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="H"
fi

# Very quiet file check
if (( $(echo "$RMS_pk_dB < $RMS_min_dB_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="Q"
fi

# Flat factor check
if (( $(echo "$flat_factor > $flat_factor_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="F"
fi

# Clipping check - peak is max and many samples are at peak
if (( $(echo "$max_level >= 1" | bc -l) )); then
    if (( $(echo "$pk_count > $pk_count_threshold" | bc -l) )); then
        threshold_failed=true
        err_code+="C"
    fi
fi

# Short file check
if (( $(echo "$length_s < $length_s_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="S"
fi

# Long file check
if (( $(echo "$length_s > $length_l_threshold" | bc -l) )); then
    threshold_failed=true
    err_code+="L"
fi

# for data collection purposes, comment out the conditional and the values
# for all found files will be output.
if [ "$threshold_failed" = true ]; then
    echo -e "$1" "\t" "$err_code" "\t" "$dc_offset" "\t" "$min_level" "\t" "$max_level" "\t" "$min_max_delta" "\t" "$RMS_pk_dB" "\t" "$flat_factor" "\t" "$pk_count" "\t" "$length_s"
fi
Manually Update Time Zone Data on Android 10
One of the updates that stops when your carrier decides you have to buy a new phone to keep their profits up is the time zone data. As regions decide they will or won’t continue using standard time and switch permanently to lazy people time (or not), time zone calculations start to fail, which can be awfully annoying when it causes you to miss flights or meetings, so it is probably something you’ll want to keep up to date. Unfortunately, updating it requires root access to your phone because… profits depend on the velocity with which first world money is converted to e-waste to poison third world children. Yay.
Root requires reflashing your device, which means wiping all your data and apps and reinstalling them, so it is easier to do on a new phone than to back up, restore, and re-configure all your apps on an old one. Sooner or later your vendor will stop supporting your device in an attempt to get you to throw it away and buy a new one, and you’ll have to root it to keep it up to date and secure, so you might as well do it now, void their stupid warranty, and take control of your device.
You should also take a moment to write your elected representatives and demand that they take civil action against this crap. Let’s take a short rant break, shall we?
Planned obsolescence, death by security flaws, and vendor locks should be prosecuted, not just as illegal profiteering but as environmental crimes for needlessly flooding the world with e-waste. If you own a device, you have the right to use it as you like, and any entity that, by omission or obfuscation of information reasonably needed to keep that device operational, deprives legitimate owners of rightful value should answer for it. Willfully obstructing security updates, knowing full well the risks implied, is coercive if not extortion. Actively blocking the provision of third party services intended to mitigate these harms, through barratry and legal extortion, should be prosecuted aggressively. Everyone who has purchased a phone that has been intentionally and unfairly life-limited by non-replaceable batteries, intimidation of repair services, manipulation of the spare parts market, or restriction or obfuscation of security updates is due a refund of the value thus denied, plus penalties.
Ah, that feels better, no?
Assuming you have a rooted phone, adb installed on your computer, and TZ data that is out of date, let’s get it fixed, shall we? The problem is that TZ data comes from IANA (from here, actually) and is versioned in a form like 2023c, the current version as of now. That’s lovely, but the format they provide is not compatible with Android and needs to be transformed. Google seems to have some tools for this in the FOSS branch of Android, but they seem a little useless without a virtual environment, a PITA. But the good folks at LineageOS (yay, FOSS!!!) maintain their version of the tool, with the thus-created output data, in their git, which we can use for all Android devices (it seems). The files we need are in this directory: note that these are 2023a, but 2023c is identical to 2023a, reverting some changes made in 2023b because, I don’t know, the whole mess about getting up an hour earlier or later twice a year being some traumatic experience is catastrophic for people’s sense of well-being, but when they get up at different times on days off than on work days, that doesn’t count or something. OMG. So drama. People. Sometimes it hurts to be associated with them as a species. Not that I care, but stop messing around and just pick one. So many rant triggers in this whole mess.
Anyway, proceeding with the assumption that your device is rooted and you have adb installed on your computer, the files needed are:
- tzdata—a binary file that, viewed with a text editor, should start with: tzdata2023a
- tzlookup.xml—an xml file that should (nearly) start with: <timezones ianaversion="2023a">
- tz_version—a simple text file that should have one line: 003.001|2023a|001
Download the compressed .tgz archive of the output_data directory from here by clicking on the [tgz] text at the top right. You should get a .tgz archive, from which you want to extract:
- tzlookup.xml from the android folder
- tzdata from the iana folder
- tz_version from the version folder
Here’s the tricky bit: you’ve got to get these files to the right places. I mounted my android on my computer, created a folder TZdata in Downloads, and copied the files there; this resolved to /data/media/0/Download/TZdata/ on my device. While you’re there, make a folder like oldTZ in the same place for backup. Everything else is done by command line via adb.
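If mounting the device as storage is a hassle, the same staging can be done with adb from the computer side; a minimal sketch, assuming the three extracted files are in the current directory (on most devices /sdcard maps to /data/media/0, but verify on yours):

adb shell mkdir -p /sdcard/Download/TZdata /sdcard/Download/oldTZ
adb push tzdata tzlookup.xml tz_version /sdcard/Download/TZdata/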
# comments are demarcated with "#"; the prompt is assumed

# get shell on your device
adb shell

# get root; if this fails, you don't have root, bummer, you don't really own your device.
su root

# verify your tz data is where mine was; if so, copypasta should be safe.
find / -name tzdata 2>/dev/null

# output for me looks like this (some are symlinks):
/apex/com.android.tzdata/etc/tz/tzdata
/apex/com.android.tzdata@290000000/etc/tz/tzdata
/apex/com.android.runtime/etc/tz/tzdata
/apex/com.android.runtime@1/etc/tz/tzdata
/system/apex/com.android.runtime.release/etc/tz/tzdata
/system/apex/com.android.tzdata/etc/tz/tzdata
/system/usr/share/zoneinfo/tzdata

# did ya get the same or close enough to figure out what to do next? good.
# Backup your old stuff
cp /system/apex/com.android.tzdata/etc/tz/* /data/media/0/Download/oldTZ

# your directories are read only, so you need to fix that, scary but reversible
mount -o rw,remount /
mount -o rw,remount /apex/com.android.tzdata
mount -o rw,remount /apex/com.android.runtime

# copy the new files over the old files; the last location is legacy and doesn't
# seem to have a copy of tzlookup.xml, so we don't put a new one there, but check
ls /system/usr/share/zoneinfo
# only tzdata and tz_version? Good.
cp /data/media/0/Download/TZdata/* /apex/com.android.tzdata/etc/tz
cp /data/media/0/Download/TZdata/* /apex/com.android.runtime/etc/tz
cp /data/media/0/Download/TZdata/* /system/apex/com.android.tzdata/etc/tz
cp /data/media/0/Download/TZdata/tz_version /system/usr/share/zoneinfo
cp /data/media/0/Download/TZdata/tzdata /system/usr/share/zoneinfo

# all done, now we just gotta read-only those directories again
mount -o ro,remount /
mount -o ro,remount /apex/com.android.tzdata
mount -o ro,remount /apex/com.android.runtime

# and why not reboot from the command line?
reboot
That was fairly painless once you know what to do and have root, no? It worked for me: my phone rebooted and the time zone database appears to be updated. YMMV, hopefully not on the rebooting-successfully part, but bricking a phone is a risk because, you know, profits. After that tz file surgery I created a new event in a US time zone that recently changed their daylight savings to pacify the crazies and it seemed to work as expected.
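A quick way to double-check that the new data actually landed is to read back the version file from one of the updated locations (paths per the find output above); it should report the new IANA version:

adb shell su -c 'cat /apex/com.android.tzdata/etc/tz/tz_version'
# expect something like: 003.001|2023a|001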
Projecting Qubit Realizations to the Cryptopocalypse Date
RSA 2048 is predicted to fail by 2042-01-15 at 02:01:28.
Plan your bank withdrawals accordingly.
Way back in the ancient era of 2001, long before the days of iPhones, back when TV was in black and white and dinosaurs still roamed the earth, I delivered a talk on quantum computing at DEF CON 9.0. In the conclusion I offered some projections about the growth of quantum computing based on reported growth of qubits to date. Between the first qubit in 1995 and the 8 qubit system announced before my talk in 2001, qubits were doubling about every 2 years.
I drew a comparison with Moore’s law, that computers double in power every 18 months, i.e. grow as 2^(years/1.5). A feature of quantum computers is that the power of a quantum computer increases as 2 to the power of the number of qubits, which was itself doubling, then every two years, so quantum computing power grows as 2^2^(years/2). In ASCII: Moore’s law is 2^(Y/1.5) and Gessel’s law is 2^2^(Y/2).
As far as I know, nobody has taken up my formulation of quantum computing power as a time series double exponential function of the number of qubits in a parallel structure to Moore’s law. It seems compelling, despite obviously having a few (minor) flaws. A strong counter argument to my predictions is that useful quantum computers require stable, actionable qubits, not noisy ones that might or might not be in a useful state when measured. Data on stable qubit systems is still too limited to extrapolate meaningfully, though a variety of error correction techniques have been developed in the past two decades to enable working, reliable quantum computers. Those error correction techniques work by combining many “raw” qubits into a single “logical” qubit at around a 10:1 ratio, which certainly changes the regression substantially, though not the formulation of my “law.”
I generated a regression of qubit growth over the full useful quantum computer history, 1998–2023, and performed a least-squares fit to an exponential doubling period and got 3.376 years, quite a bit slower than the heady early years’ 2.0-year doubling rate. On the other hand, fitting an exponential curve to all announcements in the modern 2016–2023 period yields a doubling period of only 1.074 years. The qubit doubling period is only 0.820 years if we fit to just the most powerful quantum computers released, ignoring various projects’ lower-than-maximum qubit count announcements; I can see arguments for either, though I selected the former (1.074 years) as somewhat less aggressive.
From this data, I offer a formulation of what I really hope someone else somewhere will call, at least once, “Gessel’s Law”: P = 2^2^(y/1.1) or, more generally, given that we still don’t have enough data for a meaningful regression, P = 2^2^(y/d); quantum computational power will grow as 2 to the power 2 to the power years over a doubling period, which will become more stable as the physics advances.
Gidney (of Google) & Ekerå published How to factor 2048-bit RSA integers in 8 hours using 20 million noisy qubits on 2021-04-13, so far the most efficient known (as in not hidden behind classification, should such classified devices exist) explicit algorithm for cracking RSA. The qubit requirement, 2×10⁷, is certainly daunting, but with a doubling time of 1.074 years, we can expect to have a 20,000,000 qubit computer by 2042. Variations will also crack Diffie-Hellman and even elliptic curves, creating some very serious security problems for the world, not just from the failure of encryption but from the exposure of all so-far encrypted data to unauthorized decryption.
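To make the projection arithmetic explicit (my notation, a sketch of the extrapolation rather than anything from the paper): if the qubit count N doubles every d years from a baseline N₀ in year y₀, then the year y* at which a requirement N_req is met follows from the exponential model:

\[
N(y) = N_0 \cdot 2^{(y - y_0)/d}
\quad\Longrightarrow\quad
y^{*} = y_0 + d\,\log_2\!\left(\frac{N_{\mathrm{req}}}{N_0}\right)
\]

Plugging in N_req = 2×10⁷ and d = 1.074 years with the baseline from the 2016–2023 regression is what lands y* in early 2042.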
Based on the 2016–2023 all-announcements regression and Gidney & Ekerå’s requirement, we predict RSA 2048 will fall on 2042-01-15 at about 2 am, a prediction not caveated by the error-correction requirement for stable qubits, since they count noisy, raw qubits just as I do. As a validity check, my regression predicts “Quantum Supremacy” right at Google’s 2022 announcement.
AI PSYOPS are changing strategic messaging
Social media fundamentally changed strategic messaging, cutting the cost per effect by at least two orders of magnitude, probably more; it has become the most cost effective munition in the global arsenal. Even when it took teams of actual humans to populate content and troll farms to flood social media with messaging intended to produce a desired outcome, for example to swing an election, start a war, damage alliances, break treaties, or generate support for one policy or another, foreign or domestic, it was still a revolution in reduced-cost warfare.
Take Operation INFEKTION, the active measures campaign run by the KGB starting in about 1983 “to create a favorable opinion for us abroad that this disease (AIDS) is the result of secret experiments with a new type of biological weapon by the secret services of the USA and the Pentagon that spun out of control.”
This campaign leveraged assets put in place as far back as 1962 and eventually consumed the authority of Prof. Jakob Segal as a self-referential authoritative citation. After a little more than a decade of relentless media placements of strategic messaging, even in the United States more than 25% of the population had been convinced AIDS was a government project and 12% had been manipulated into believing it was created and spread by the CIA. This project was tremendously successful despite having to overcome the then-standard and generally principled editorial gatekeeping that protected “traditional” media from abuse and co-optation, by manufacturing plausible chains of authority and fabricating deep and broad reference chains to thwart fact checking.
By the 2016 election, the KGB’s successors, the IRA and GRU, efficiently and expertly leveraged social media to achieve even more impressive results, possibly winning the most significant military battle in history: altering the outcome of the US election at a cost of only a few billion dollars and a mere 2–3 years of effort.
Any-to-any publishing circumvents editorial protections (he writes without a trace of irony). What might otherwise be a limitation, that a psyop sits clearly outside any authoritative endorsement (something that required the consumption of an asset like Jakob Segal to overcome in an earlier era), has been overwhelmingly diminished by a parallel effort to destroy trust in institutions and authority, creating a direct path to shape the beliefs of targets through mass individualization of messaging, unchecked by any need for longitudinal reputation building.
Even so, the 2016 effort cost billions, requiring a massive capacity build of English-speaking, internet-savvy teams inducted into “troll farms” (many ironically located in Bulgaria, given that country’s role in Operation INFEKTION), and that cost structure may already be obsolete just 8 years later.
Many have written about ChatGPT representing some sort of existential risk to humanity’s future, some quick resolution to the Fermi Paradox, but the real risk is an acceleration of the destruction of objective truth and the substitution of conceptual paradigms that align with strategic outcomes.
As an example, let me introduce to you Dr. Alexander Greene, a person ChatGPT tells us “is a highly esteemed and celebrated professor with a remarkable career dedicated to advancing the fields of green energy and engineering.”
Obviously, it’s hard to really believe Dr. Greene without seeing the man himself, but fortunately we have a tool for that too:
A few images from bing/Dall-E and we can create a very convincing article that would easily pass muster as an authoritative discussion of the benefits of continuing to burn fossil fuels. With minimal editing and formatting, mostly to cut out the caveats that ChatGPT inserts in counterfactual text requests, we have such pearls of wisdom to impart upon the world as:
Access to affordable and reliable energy is a crucial driver of economic development, and historically, fossil fuels have played a significant role in providing low-cost energy solutions. While there are concerns about the environmental impact of fossil fuels, particularly their contribution to climate change, it is essential to understand the benefits they have brought to the developing world and the potential consequences of increasing energy costs.
Read the whole synthetic article in pdf form below and consider the difficulty of finding a shared factual foundation in a world where it is trivial to synthesize plausible authority.
The Benefits of Low-Cost Energy from Fossil Fuels and the Impact of Increasing Energy Costs on Developing Nations, “by” “Dr. Alexander Greene” (ghost written by ChatGPT).
Mobotix Notifier in Python – get desktop messages from your cameras
I wrote a little code in Python to act as a persistent, small-footprint LAN listener for Mobotix cameras’ IP Notify events. If such a thing is useful to you, the code and a compiled .exe version are linked/inline. It works on both Windows and Linux as Python code; for Windows there’s a humongous (14MB) .exe file to use if you don’t want to install Python and mess with the command line in PowerShell.
Mobotix cameras have a pretty cool low-level feature: via the camera web interface you can program a raw IP-packet event to be sent to a destination when the camera detects a trigger, for example motion, PIR over threshold, noise level, thermal trigger, or the various AI detectors available on the 7 series cameras. Mobotix had a simple notification application, but some of these older bits of code aren’t well supported any more, and Linux support didn’t last long at the company, alas. The camera runs Linux; why you’d want a client appliance to run anything but Linux is beyond me, but I guess companies like to overpay for crappy software rather than use a much better, free solution.
I wanted something that would push an otherwise non-intrusive desktop notification when the camera triggered on something like a cat coming by for dinner. Optimally this would be done with broadcast packets over UDP, but Mobotix doesn’t support UDP broadcast IP Notify messaging yet, just TCP, so each recipient address (or DNS name) has to be specified on each camera, rather than just picking a port and having all the listeners tune into that port over broadcast. Hopefully that shortcoming will be fixed soon.
This code runs headless; there’s no interaction. From the command line, just run ./mobotix_notifier.py & and off it goes; from Windows, either the same for the savvy or double-click the exe. All it does is listen on port 8008/TCP and, if it gets a message from a camera, reach out and grab the current video image, iconify it, and push a notification using the OS’s notification mechanism, which appears as a pop-up window for a few seconds with a clickable link to open the camera’s web page. It works whether you have one camera or a hundred, but it is not intended for frequent events, which would flood the desktop with annoyance; think rather of a front door camera that might message when someone’s at the door. In a monitoring environment, it might be useful for signaling critical events.
Mobotix Camera Set Up
On the camera side there are just two steps: setting up an IP-Notify action from the Admin Menu and then defining an Action Group from the Setup Menu to trigger it.
The title is the default “SimpleNotify” – that can be anything.
The Destination addresses are the IPs of the listener machines and their port numbers. You can add as many as needed, but for now it is not possible to send a UDP broadcast message as UDP isn’t supported yet. It may be soon; I’ve requested the capability, and I expect the mechanism is just a front end for netcat (nc), as it would be strange to write a custom packet generator when netcat is available. For now, no broadcast, just IP to IP, so you have to manually enumerate all listeners.
I have the profile set for sequential send to all rather than parallel just for debugging; devices further down the list will have lower latency with parallel send.
The data protocol is raw TCP/IP, no UDP option here yet…
The data type is plain text, which is easier to parse at the listener end. The data structure I’m using reads: $(id.nam), $(id.et0) | Time: $(fpr.timestamp) | Event: $(EVT.EST.ACTIVATED) | PIR: $(SEN.PIR) | Lux: $(SEN.LXL) | Temp: $(SEN.TOU.CELSIUS) | Thermal: $(SEN.TTR.CELSIUS), but it can be anything that’s useful.
Mobotix cameras have a robust programming environment for enabling fairly complex “If This Then That” style operations, and triggering is no exception. One might reasonably configure the Visual Alarm (now with multiple frame colors, another request of mine, so that you can have different visual indicators for different detected events; create different definitions at /admin/Visual Alarm Profiles), use a fairly liberal criterion to trigger recording, and use a more strict “uh oh, this is urgent” criterion to trigger pushing a message to your new listeners.
This config should be fairly obvious to anyone familiar with Mobotix camera configuration: it’s set to trigger on all detected events but not more than once every 5 seconds. Given it is pushing a desktop alert, a longer deadtime might be appropriate, depending on the specifics of the triggering events configured.
That’s all that’s needed on the camera end: when a triggering event occurs, the camera will take action by making a TCP connection to the IP addresses enumerated on the selected port and, once the connection is negotiated, push the text structure. All we need now is something to listen.
Python Set Up
The provided code can be run as a python “application” but python is an interpreted language and so needs the environment in which to interpret it properly configured. I also provide a compiled exe derived from the python code using PyInstaller, which makes it easier to run without Python on Windows where most users aren’t comfortable with command lines and also integrates more easily with things like startup applications and task manager and the like.
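If you’d rather build the .exe yourself than trust a random 14MB binary from the internet (wise), a PyInstaller build along these lines should work; treat it as a sketch, not necessarily the exact flags used for the posted binary (--onefile bundles the interpreter and libraries into a single executable):

python -m pip install pyinstaller
pyinstaller --onefile mobotix_notifier.py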
If you’re going to run the Python command-line version, you can use these instructions for Windows, or these for Linux, to set up Python. Just make sure to install a version more recent than 3.7 (you’d have to work at installing an older version than that). Then, once Python is installed and working, install the libraries this script uses in either Windows PowerShell or a Linux shell as below. Note that python3 specifies the 3.x series of Python vs. 2.x and is only necessary on systems with earlier-version baggage like mine.
python[3] -m pip install plyer dnspython py-notifier pillow --upgrade
Once Python is installed, you should be able to run the program from its directory by just typing ./mobotix_notifier.py, obviously after you’ve downloaded the code itself (see below).
Firewalls: Windows and Linux
Linux systems often have Uncomplicated Firewall (UFW) running. The command to open the ports in the firewall to let any camera on the LAN reach the listener is:
sudo ufw allow from 192.168.100.0/24 proto tcp to any port 8008
# if you make a mistake
sudo ufw status numbered
sudo ufw delete 1
This command allows TCP traffic in from the LAN address (192.168.100.0/24, edit as necessary to match your LAN’s subnet) on port 8008. If a broadcast/UDP version comes along, the firewall rule will change a little. You can also reduce the risk surface by limiting the allowed traffic to specific camera IPs if needed.
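For example, to admit only a single (hypothetical) camera at 192.168.100.23 rather than the whole subnet:

sudo ufw allow from 192.168.100.23 proto tcp to any port 8008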
On Windows, the first time the program is run, either as the Python script or the executable, you’ll get a prompt like
You probably don’t need to allow public networks, but it depends on how you’ve defined your network ranges whether Windows considers your LAN public or private.
Default Icon Setup
One of the features of the program is to grab the camera’s event image and convert it to the alert icon, which provides a nearly uselessly low-rez visual indicator of the device reporting and the event that caused the trigger. The icon size itself is 256×256 pixels on Linux and 128×128 on Windows (.ico). Different window managers/themes provide more or less flexibility in defining the alert icons; mine are kinda weak.
The Win-10 notification makes better use of the icon. Older versions of Linux had a notification customization tool that seems to have petered out at 16.x, alas, but the icons have some detail if your theme will show them.
Another feature is that the code creates the icon folder if it doesn’t exist. It almost certainly will exist on Linux but probably won’t on Windows, unless you’ve run some other Linuxy stuff on your Windows box. The directory created on Windows is your home directory\.local\share\icons\ and on Linux the directory should already exist as ~/.local/share/icons/. In that directory you should copy the default camera icon as “mobotix-cam.ico” like so:
You can put any icon there as your preferred default as long as it is in .ico format, or use the one below (right-click on the image or link and “save as” to download the .ico file with resolution layers):
If, for some reason, the get image routine fails, the code should substitute the above icon so there’s a recognizable visual cue of what the notification is about.
mobotix_notifier.py code
The Python code below can be saved as “mobotix_notifier.py” (or anything else you like) with the execution bit set; then it can be run as ./mobotix_notifier.py on Linux or python .\mobotix_notifier.py on Windows. On Linux, the full path to where you’ve installed the command can be set as a startup app and it will run on startup/reboot and just listen in the background. It uses about 13 seconds a day of CPU time on my system.
Click to download the Windows .exe, which should download as mobotix_notifier.exe (14.0MiB). After the above configuration steps on the camera(s) and firewall are completed, it should start silently, run in the background after launch (kill it with Task Manager if needed), and push desktop alerts as expected. I used “UC” alarms to test rather than waiting for stray cats.
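You can also exercise the listener without a camera at all: any LAN machine can fake an event with netcat. A minimal sketch, with a hypothetical listener at 192.168.100.10 and an illustrative camera-style payload (-q 1 closes the connection after sending on traditional/GNU netcat; flags vary by netcat flavor):

echo 'TestCam, 00:03:c5:00:00:00 | Time: 2024-01-01 12:00:00 | Event: UC' | nc -q 1 192.168.100.10 8008

One caveat: the listener will try to fetch an event image from the sending host, and the default-icon fallback only covers a non-200 HTTP response, so if the sender isn’t serving HTTP at all the image grab can fail with an exception.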
The python code is:
#!/usr/bin/env python3

import requests
from PIL import Image
import socket
from plyer import notification
import io
import os.path

# note windows version needs .ico files
# note windows paths have to be r type to handle
# backslashes in windows paths

# Check operating environment and define path names
# for the message icons accordingly.
# if OS path doesn't exist, then create it.
if os.name == "nt":
    Ipath = r"~\.local\share\icons\mobotix-cam.ico"
    Epath = r"~\.local\share\icons\mobotix-event.ico"
    fIpath = os.path.expanduser(Ipath)
    fEpath = os.path.expanduser(Epath)
    dirpath = os.path.dirname(fEpath)
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)
else:
    Ipath = "~/.local/share/icons/mobotix-cam.png"
    Epath = "~/.local/share/icons/mobotix-event.png"
    fIpath = os.path.expanduser(Ipath)
    fEpath = os.path.expanduser(Epath)
    dirpath = os.path.dirname(fEpath)
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)


def grab_jpeg_image(camera_ip):
    """Grabs a JPEG image from the specified camera IP."""
    # Make a request to the camera IP
    response = requests.get(f"http://{camera_ip}/control/event.jpg", stream=True)  # noqa
    # Check if the request was successful
    if response.status_code == 200:
        # Convert the response data to an image
        image = Image.open(io.BytesIO(response.content))
        # Return the image
        return image
    else:
        # import the default icon
        image = Image.open(fIpath)
        # Return the image
        return image


def convert_jpeg_to_png(image, width, height):
    """Converts a JPEG image to a PNG image."""
    # size = width, height
    # Scale the image
    image.thumbnail((width, height), Image.Resampling.LANCZOS)
    # Save the image according to OS convention
    if os.name == "nt":
        icon_sizes = [(16, 16), (32, 32), (48, 48), (64, 64), (128, 128)]
        image.save(fEpath, format='ICO', sizes=icon_sizes)
    else:
        image.save(fEpath)


def iconify(src_ip):
    # Grab the JPEG image from the camera
    image = grab_jpeg_image(src_ip)
    # Convert the JPEG image to a PNG image
    convert_jpeg_to_png(image, 256, 256)


def reverse_dns_lookup(src_ip):
    try:
        return socket.gethostbyaddr(src_ip)[0]
    except socket.gaierror:
        return "no dns"
    except socket.herror:
        return "no dns"


def test_str(answer):
    try:
        return str(answer)
    except TypeError:
        return answer.to_text()


def listener():
    """Listens for incoming connections on port 8008."""
    # Create a socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind the socket to port 8008
    sock.bind(("0.0.0.0", 8008))
    # Listen for incoming connections
    sock.listen(1)
    while True:
        # Accept an incoming connection
        conn, addr = sock.accept()
        # Receive the payload of the packet
        data = conn.recv(2048)
        # Close the connection
        conn.close()
        # convert from literal string to remove b' prefix of literal string
        data = str(data)[2:-1]
        # Extract the source IP from the address
        src_ip = addr[0]
        # Grab the event image as an icon
        iconify(src_ip)
        # Do a DNS lookup of the source IP
        answer = reverse_dns_lookup(src_ip)
        # Get the hostname from the DNS response
        hostname = test_str(answer)
        # Write the hostname to notify-send
        title = (f"Event from: {hostname} - {src_ip}")
        message = (f"{data} http://{src_ip}/control/userimage.html")
        notification.notify(
            title=title,
            message=message,
            app_icon=fEpath,
            timeout=30,
            toast=False)
        # Echo the data to stdout for debug
        # print(f"Event from {hostname} | {src_ip} {data}")


if __name__ == "__main__":
    listener()
Please note the usual terms of use.
The end of a comic era
Tonight I listened to the last episode of NPR’s excellent and hilarious Ask Me Another; though originally broadcast on 2021-09-24, it didn’t reach my ears until tonight thanks to the magic of podcasts. It was genuinely hard to hear them sign off for the last time. I will really miss this show and the warmth and good spirits of Ophira Eisenberg and Jonathan Coulton.
I’ve been listening to this show since it started, back far enough to have first heard it over syndicated FM broadcast on KQED at home, and since then on various digital media over the years wherever I’ve been, even here in Iraq. It suffered when Covid hit; the energy and charm didn’t translate well to Zoom and no audience, as so many things didn’t, and it sadly didn’t live to see Covid restrictions lifted. It would have been fitting if they’d been able to record their last show at The Bell House one more time. Maybe someday they can have a reunion show.
US Public Radio has been an anchor of good quality programming, from Car Talk, which I still listen to weekly despite the questions being increasingly out of touch (though the cars have long been fairly irrelevant), to Fresh Air and Terry Gross’s voice, which came from my mother’s kitchen radio every afternoon from WHYY about as far back as I can remember.