David Gessel

Mobotix Notifier in Python – get desktop messages from your cameras

Tuesday, June 6, 2023

I wrote a little code in python to act as a persistent, small footprint LAN listener for Mobotix cameras IP Notify events. If such a thing is useful to you, the code and a .exe compiled version are linked/inline. It works on both Windows and Linux as python code. For Windows there’s a humongous (14MB) .exe file use if you don’t want to install Python and mess with the command line in power shell.

Message generated by the Windows Notifier

Mobotix cameras have a pretty cool low-level feature by which you can program via the camera web interface a raw IP-packet event to send to a destination if the camera detects a trigger, for example motion, PIR over threshold, noise level, thermal trigger, or the various AI detectors available on the 7 series cameras. Mobotix had a simple notification application, but some of these older bits of code aren’t well supported any more and Linux support didn’t last long at the company, alas. The camera runs Linux, why you’d want a client appliance to run anything but Linux is beyond me, but I guess companies like to overpay for crappy software rather than use a much better, free solution.

I wanted something that would push an otherwise not intrusive notification when the camera triggered for something like a cat coming by for dinner, pushing a desktop notification. Optimally this would be done with broadcast packets over UDP, but Mobotix doesn’t support UDP broadcast IP Notify messaging yet, just TCP, so each recipient address (or DNS name) has to be specified on each camera, rather than just picking a port and having all the listeners tune into that port over broadcast. Hopefully that shortcoming will be fixed soon.

This code runs headless, there’s no interaction. From the command line just ./mobotix_notifier.py & and off it goes. From windows, either the same for the savvy or double click the exe. All it does is listen on port 8008/TCP and if it gets a message from a camera, reach out and grab the current video image, iconify it, then push a notification using the OS’s notification mechanism which appears as a pop-up window for few seconds with a clickable link to open the camera’s web page. It works if you have one or a 100 cameras, but it is not intended for frequent events which would flood the desktop with annoyance, rather a front door camera that might message if someone’s at the door. In a monitoring environment, it might be useful for signaling critical events.

Mobotix Camera Set Up

On the camera side there are just two steps: setting up an IP-Notify action from the Admin Menu and then defining an Action Group from the Setup Menu to trigger it.

IP Notify Profile

The title is the default “SimpleNotify” – that can be anything.

The Destination addresses are the IPs of the listener machines and port numbers. You can add as many as needed but for now it is not possible to send a UDP broadcast message as UDP isn’t supported yet. It may be soon, I’ve requested the capability and I expect the mechanism is just a front end for netcat (nc) as it would be strange to write a custom packet generator when netcat is available. For now, no broadcast, just IP to IP, so you have to manually enumerate all listeners.

I have the profile set for sequential send to all rather than parallel just for debugging, devices further down the list will have lower latency with parallel send.

The data protocol is raw TCP/IP, no UDP option here yet…

The data type is plain text, which is easier to parse at the listener end. The data structure I’m using reads: $(id.nam), $(id.et0) | Time: $(fpr.timestamp) | Event: $(EVT.EST.ACTIVATED) | PIR: $(SEN.PIR) | Lux: $(SEN.LXL) | Temp: $(SEN.TOU.CELSIUS) | Thermal: $(SEN.TTR.CELSIUS) but it can be anything that’s useful.

Mobotix cameras have a robust programming environment for enabling fairly complex “If This Then That” style operations and triggering is no exception. One might reasonably configure the Visual Alarm (now with multiple Frame Colors, another request of mine, so that you can have different visual indicators for different detected events, create different definitions at /admin/Visual Alarm Profiles), a fairly liberal criterion might be used to trigger recording, and a more strict “uh oh, this is urgent” criterion might be used to trigger pushing a message to your new listeners.

Action Group Push Message

This config should be fairly obvious to anyone familiar with Mobotix camera configuration: it’s configured to trigger at all detected events but not more than once every 5 seconds. given it is pushing a desktop alert, a longer deadtime might be appropriate depending on the specifics of triggering events that are configured.

That’s all that’s needed on the camera end: when a triggering event occurs the camera will take action by making a TCP connection to the IP address enumerated on the selected port and, once the connection is negotiated push the text structure. All we need now is something to listen.

Python Set Up

The provided code can be run as a python “application” but python is an interpreted language and so needs the environment in which to interpret it properly configured. I also provide a compiled exe derived from the python code using PyInstaller, which makes it easier to run without Python on Windows where most users aren’t comfortable with command lines and also integrates more easily with things like startup applications and task manager and the like.

If you’re going to run the python command-line version, you can use these instructions for Windows, or these for Linux to set up Python. Just make sure to install a version more recent than 3.7 (you’d have to work at installing an older version than that). Then, once python is installed and working, install the libraries this script uses in either windows powershell or Linux shell as below. Note that python3 specifies the 3.x series of python vs. 2.x and is only necessary in systems with earlier version baggage like mine.

python[3] -m pip install plyer dnspython py-notifier pillow --upgrade

Once python is installed, you should be able to run the program from the directory by just typing ./mobotix_notifier.py, obviously after you’ve downloaded the code itself (see below).

Firewalls: Windows and Linux

Linux systems often have Uncomplicated Firewall (UFW) running. The command to open the ports in the firewall to let any camera on the LAN reach the listener is:

sudo ufw allow from 192.168.100.0/24 proto tcp to any port 8008
# if you make a mistake
sudo ufw status numbered
sudo ufw delete 1

This command allows TCP traffic in from the LAN address (192.168.100.0/24, edit as necessary to match your LAN’s subnet) on port 8008. If a broadcast/UDP version comes along, the firewall rule will change a little. You can also reduce the risk surface by limiting the allowed traffic to specific camera IPs if needed.

On windows, the first time the program is run, either the python script or the executable, you’ll get a prompt like

Windows Defender Notification for Mobotix Notifier

You probably don’t need to allow public networks, but it depends on how you’ve defined your network ranges whether Windows considers your LAN public or private.

Default Icon Setup

One of the features of the program is to grab the camera’s event image and convert it to the alert icon which provides a nearly uselessly low rez visual indicator of the device reporting and the event that caused the trigger. The icon size itself is 256×256 pixels on linux and 128×128 on windows (.ico). Different window managers/themes provide more or less flexibility in defining the alert icons. Mine are kinda weak.

Linux event notificationThe win-10 notification makes better use of the icon. Older versions of linux had a notification customization tool that seems to have petered out at 16.x, alas. But the icons have some detail if your theme will show them.

Another feature is that the code creates the icon folder if it doesn’t exist. It almost certainly will on Linux but probably won’t on windows unless you’ve run some other Linuxy stuff on your windows box. The directory created on windows is your home directory\.local\share\icons\. On Linux systems, the directory should exist and is ~/.local/share/icons/. In that directory you should copy the default camera icon as “mobotix-cam.ico” like so:

where to put mobotix-cam.ico

You can put any icon there as your preferred default as long as it is in .ico format, or use the one below (right-click on the image or link and “save as” to download the .ico file with resolution layers):

Mobotix Camera M16If, for some reason, the get image routine fails, the code should substitute the above icon so there’s a recognizable visual cue of what the notification is about.

mobotix_notifier.py code

The python code below can be saved as “mobotix_notifier.py” (or anything else you like) and the execution bit set, then it can be run as ./mobotix_notifier.py on Linux or python .\mobotix_notifier.py on Windows. On Linux, the full path to where you’ve installed the command can be set as a startup app and it will run on startup/reboot and just listen in the background. It uses about 13 seconds a day of CPU time on my system.

Click to download the Windows .exe which should download as mobotix_notifier.exe. (14.0MiB) After the above configuration steps of on the camera(s) and firewall are completed it should start silently and run in the background after launch (kill it with task manager if needed) and push desktop alerts as expected. I used “UC” alarms to test rather than waiting for stray cats.

The python code is:

#!/usr/bin/env python3

import requests
from PIL import Image
import socket
from plyer import notification
import io
import os.path

# note windows version needs .ico files
# note windows paths have to be r type to handle
# backslashes in windows paths
# Check operating environment and define path names
# for the message icons accordingly.
# if OS path doesn't exist, then create it.

if os.name == "nt":
    Ipath = r"~\.local\share\icons\mobotix-cam.ico"
    Epath = r"~\.local\share\icons\mobotix-event.ico"
    fIpath = os.path.expanduser(Ipath)
    fEpath = os.path.expanduser(Epath)
    dirpath = os.path.dirname(fEpath)
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)

else:
    Ipath = "~/.local/share/icons/mobotix-cam.png"
    Epath = "~/.local/share/icons/mobotix-event.png"
    fIpath = os.path.expanduser(Ipath)
    fEpath = os.path.expanduser(Epath)
    dirpath = os.path.dirname(fEpath)
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)


def grab_jpeg_image(camera_ip):
    """Grabs a JPEG image from the specified camera IP."""

    # Make a request to the camera IP
    response = requests.get(f"http://{camera_ip}/control/event.jpg", stream=True) # noqa

    # Check if the request was successful
    if response.status_code == 200:
        # Convert the response data to an image
        image = Image.open(io.BytesIO(response.content))

        # Return the image
        return image

    else:
        # import the default icon
        image = Image.open(fIpath)

        # Return the image
        return image


def convert_jpeg_to_png(image, width, height):
    """Converts a JPEG image to a PNG image."""

    # size = width, height

    # Scale the image
    image.thumbnail((width, height), Image.Resampling.LANCZOS)

    # Save the image according to OS convention
    if os.name == "nt":
        icon_sizes = [(16, 16), (32, 32), (48, 48), (64, 64), (128, 128)]
        image.save(fEpath, format='ICO', sizes=icon_sizes)
    else:
        image.save(fEpath)


def iconify(src_ip):

    # Grab the JPEG image from the camera
    image = grab_jpeg_image(src_ip)

    # Convert the JPEG image to a PNG image
    convert_jpeg_to_png(image, 256, 256)


def reverse_dns_lookup(src_ip):

    try:
        return socket.gethostbyaddr(src_ip)[0]
    except socket.gaierror:
        return "no dns"
    except socket.herror:
        return "no dns"


def test_str(answer):
    try:
        return str(answer)
    except TypeError:
        return answer.to_text()


def listener():
    """Listens for incoming connections on port 8008."""

    # Create a socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # Bind the socket to port 8008
    sock.bind(("0.0.0.0", 8008))

    # Listen for incoming connections
    sock.listen(1)

    while True:
        # Accept an incoming connection
        conn, addr = sock.accept()

        # Receive the payload of the packet
        data = conn.recv(2048)

        # Close the connection
        conn.close()

        # convert from literal string to remove b' prefix of literal string
        data = str(data)[2:-1]

        # Extract the source IP from the address
        src_ip = addr[0]

        # Grab the event image as an icon
        iconify(src_ip)

        # Do a DNS lookup of the source IP
        answer = reverse_dns_lookup(src_ip)

        # Get the hostname from the DNS response
        hostname = test_str(answer)

        # Write the hostname to notify-send

        title = (f"Event from: {hostname} - {src_ip}")
        message = (f"{data} http://{src_ip}/control/userimage.html")

        notification.notify(
            title=title,
            message=message,
            app_icon=fEpath,
            timeout=30,
            toast=False)

        # Echo the data to stdout for debug
        # print(f"Event from {hostname} | {src_ip} {data}")


if __name__ == "__main__":
    listener()

Please note the usual terms of use.

Posted at 08:21:09 GMT-0700

Category: CodeHowToLinuxTechnology

Get a desktop alert when Thunderbird gets constipated

Monday, May 29, 2023

An occasional annoyance with Thunderbird, especially when I travel, is that it gets constipated when the network status changes and won’t send mail. I always send messages in the background with mailnews.sendInBackground = true so they are expected to queue in the local cache file for a bit while I continue productive work and TB does the send negotiation processes in the background without locking me out. Most of the time this works fine but occasionally it hangs and that can be problematic when mail I expected to go out never left my inbox and there’s no notification in TB that the inbox is backed up. Worse, people’s mail clients sort this delayed mail weirdly depending on their local preferences and it may effectively be lost and suddenly everyone seems to be ignoring me, more than usual.

So i wrote a wee python script that will check the Unsent Messages file location and if the file there is larger than a specific size (1 Byte), it sends a message to the desktop using notify-send.

Calling the script from cron every 10 minutes or so uses just 4 msec of CPU according to time, not much of a hit.

You need to know where your local Unsent Messages file is, which TB can tell you if you right click on the Local Folders Outbox and select properties. This path goes into the variable file_name ="..." and you could change the max_size = int(x) to be a larger value than 1 if that’s useful.

How to find the location of your Thunderbird Unsent Messages file

The code to make this work is pretty simple, just:

#!/usr/bin/env python3
import os
import subprocess

# environnement vars
os.environ.setdefault('XAUTHORITY', '/home/user/.Xauthority')
os.environ.setdefault('DISPLAY', ':0.0')
###############################################################
# Enter reasonable values for the Local Folders Outbox which is
# usually in ~/user/.thunderbird/id.user/Mail/Local Folders/Unsent Messages
# and some test file size in bytes, e.g. 1
###############################################################

file_name = "/home/gessel/.thunderbird/idstring.gessel/Mail/Local Folders/Unsent Messages"
max_size = int(1)

def check_file_size(file_name, max_size):
  """Check if a file is larger than a specific size.

  Args:
    file_name: The name of the file to check.
    max_size: The maximum size of the file, in bytes.

  Returns:
    True if the file is larger than max_size, False otherwise.
  """

  file_size = os.path.getsize(file_name)
  return file_size > max_size

def send_notification(message):
  """Send a notification via notify-send.

  Args:
    message: The message to send.
  """

  subprocess.Popen(["notify-send", message])


# Check if the file is larger than the maximum size.
if check_file_size(file_name, max_size):
  # Send a notification.
  message = "Check for constipated mail, the outbox is larger than {} bytes.".format(max_size)
  send_notification(message)

If the test is triggered, you should get a desktop alert:

1x1.trans

Then I created a crontab entry like:

*/10 * * * * /home/gessel/projects/checkoutbox/checkoutbox.py

That’s it. You can test it if want to make sure it will do the right thing by pointing it intentionally at a larger than 0 byte file, assuming your Unsent Messages file is properly 0 bytes most of the time.

I hope you find this useful; please note the usual terms.

Posted at 07:49:17 GMT-0700

Category: HowToLinux

The end of a comic era

Sunday, May 14, 2023

Tonight I listened to the last episode of NPRs excellent and hilarious Ask Me Another, though originally broadcast on 2021-09-24, it didn’t reach my ears until tonight thanks to the magic of podcasts. It was genuinely hard to hear them sign off for the last time. I will really miss this show and the warmth and good spirits of Ophira Eisenberg and Jonathan Coulton.

I’ve been listening to this show since it started, back so far as to have been over syndicated FM broadcast on KQED at home and since on various digital media over the years wherever I’ve been, even here in Iraq. It suffered when Covid hit, the energy and charm didn’t translate well to zoom and without an audience as so many things didn’t and sadly didn’t live to see Covid restrictions lifted. It would have been fitting if they’d been able to record their last show at The Bell House one more time. Maybe someday they can have a reunion show.

US Public Radio has been an anchor of good quality programming, from Car Talk, which I still listen to weekly despite the questions being increasingly out of touch (though the cars have long been fairly irrelevant) and Fresh Air and Terry Gross‘ voice, which came from my mother’s kitchen radio every afternoon from WHYY about as far back as I can remember.

Posted at 17:42:12 GMT-0700

Category: EventsFunnyMediaPositiveReviews

WordPress forward and back navigation I find pleasing

Sunday, May 7, 2023

A new addition to this site is forward-back navigation at the bottom of posts. I’m not sure why I went 15 years or so without thinking about a reader who might want to browse the scintillating and inspiring topics I cover on a semi-regular basis in details and with such stunning insight but it just never occurred to me until I was messing around with creating topically relevant featured images using Stable Diffusion, but in time the thought came and I’m happy with the result.

This guide from wpbeginner was my starting point. I used their wpb_posts_nav function as a basis, but removed the SVG arrows and navigation aid text to simplify presentation. It ends up looking like this, which I simply added to functions.php of my custom child theme using the theme editor just before the final ; and last line ?>. This version is 100% derivative of the wpbbinner code with some minor deletions to my taste and reversing left/right of previous next as I tend to think of time as flowing left-to-right.

function wpb_posts_nav(){
    $next_post = get_next_post();
    $prev_post = get_previous_post();
      
    if ( $next_post || $prev_post ) : ?>
      
         
      
    

There’s a bit of css that adds some styling. I made a few small changes to my preferences and removed the bits no longer needed and appended this to the very end of the style.css that comes with my template.

/* Post Navigation Code */
.wpb-posts-nav {
    display: grid;
    grid-template-columns: 1fr 1fr;
    grid-gap: 50px;
    align-items: center;
    max-width: 1200px;
    margin: 25px auto;
}
  
.wpb-posts-nav a {
    display: grid;
    grid-gap: 20px;
    align-items: center;
}
  
.wpb-posts-nav > div:nth-child(1) a {
    grid-template-columns: 100px 1fr;
    text-align: left;
}
  
.wpb-posts-nav > div:nth-child(2) a {
    grid-template-columns: 1fr 100px;
    text-align: right;
}
  
.wpb-posts-nav__thumbnail {
    display: block;
    margin: 0;
}
  
.wpb-posts-nav__thumbnail img {
    border-radius: 0px;
}

The last step is to add the nav function call to your single.php page. In my page, it works well just after the endwhile, the code is just and for my theme is placed thus:

...


  
...

I’m happy with the results.

https://www.wpbeginner.com/wp-tutorials/how-to-add-next-previous-links-in-wordpress-ultimate-guide/

Posted at 06:51:51 GMT-0700

Category: CodeSelf-publishingTechnology

عيد مبارك

Saturday, April 22, 2023

Eid Mubarak 2023

Posted at 10:50:00 GMT-0700

Category: EventsPlaces

Technology: maximizing individual radius of lethality.

Sunday, February 5, 2023

We like to look forward into the future by pattern matching against history, something human cognition does to reduce reality into space saving symbolic representation which leads us to see cyclic patterns in everything, even random noise. We also tend to allude to Luddites when talking about people with concern for the consequences of advances in technology, including AI, seeing a pattern of fear of novel technologies that so far hasn’t destroyed society and from this we take comfort that, so too, should the fears of AI become at some future time as laughable as fears of powered looms or telegraphs.

I am not so sanguine. While there are plenty of cycles to history: the seasons, feast and famine, periodic embrace and rejection of authoritarianism; there is also continuous trajectories that project into obvious limits: population growth limited by RuBisCO efficiency, the 38ZJ remaining of the 60ZJ battery capacity the carboniferous period so kindly charged up for us (also thanks to RuBisCO), the development of technology.

I find it convenient to think of technology as primarily a tool for amplifying an individual radius of lethality: while humans certainly enjoy non-lethal uses of technology, a primary driver has always been martial (whether marital is superior or subordinate to martial is a subtle question). Perhaps a tertiary fundamental purpose of technology is to reduce human inputs in consumable assets; the demand for such assets also likely having a limit. AI as we have it now has not existed before, this is novel, it is the current tip of the spear of a persistent and exponentially accelerating trend. There’s no known existence proof that technological advancement of a species is survivable.

We might consider a crude analogy (cognizant of the risks and seductive allure of reducing complex systems to symbols) that by considering technology an amplifier of lethality there are other elements of the implied circuit such as feedback which might considered an analogue of the tendency of actualized lethality to engender a lethal response; and irreducible noise which might be considered an analogue of the tendency of a response distribution to consistent inputs resulting in some distributive tail of human response to abnormally embrace lethality as a response to benign inputs. As technology advances, the gain of the system increases and while we might be familiar with an audio system feedback loop being, perhaps painfully, limited to amplifier clipping the only obvious limit to the maximum output of a technologically amplified lethality feedback loop is annihilation.

While AI as currently implemented shares no underlying mechanism by which we recognize sentience, it provides a fairly good illusion of it and it isn’t a given that there’s a meaningful distinction between the shadows of digital puppets and those of humans. We have a tendency to nostalgically cling to the assumption of some metaphysical value in what we hold to be true and legitimate or at least the output of labor intensive process, a distinction which comes and goes.

That AI might result in students escaping the mental rigors of learning recalls the (still ongoing) hand wringing over calculators, a fear which leans into the cyclic nature of history. On the other hand, human brains appear to be shrinking, likely as a consequence of intelligence being less selective in reproductive success, so the Luddites might have been right all along. Most of us do have a calculator with us at all times, despite what our grade school teachers might have said during arithmetic. While the progress of AI as a labor saving device reducing the energy consumption of our most extravagant organ might lead in time to meaningful changes in human capacity, it seems likely that devolution won’t get a chance to progress that far.

Posted at 10:37:06 GMT-0700

Category: Technology

Sidebar featured images only on single post pages

Tuesday, January 24, 2023

After updating to WordPress 6.x and updating my theme (Clean Black based) and then merging the customizations back in with meld (yes, I really should do a child theme but this is a pretty simple theme so meld is fine), I didn’t really like the way the post thumbnails are shown, prefering to keep it to the right. I mean clean black was last updated in 2014 and while it still works fine, but that was a while ago. Plus I had hand-coded a theme sometime in the naughties and wanted to more or less keep it while taking advantage of some of the responsive features introduced about then.

Pretty much any question one might have, someone has asked it before, and I found some reasonable solutions, some more complex than others. There’s a reasonable 3 modification solution that works by creating another sidebar.php file (different name, same function) that gets called by single.php (and not the main page) that has the modification you want, but that seemed unnecessarily complicated. I settled on a conditional test is_singular which works to limit the get_the_post_thumbnail call to where I wanted and not to invoke it elsewhere. A few of the other options on the same stackexchange thread didn’t work for me, your install may be different. What I settled on (including a map call for geo-tagged posts) is:

And I get what i was looking for, a graphical anchor at the top of the single post (but not pages) for the less purely lexically inclined that didn’t clutter the home page or other renderings with a wee bit o php.

Posted at 17:10:36 GMT-0700

Category: HowToLinuxSelf-publishing

LastPass: The Cloud is Public and Ephemeral

Thursday, January 5, 2023

More or less, anytime I’m prompted, I’ll take the opportunity to say “The cloud, like its namesake, is public and ephemeral.” In his article, “A Breach at LastPass Has Password Lessons for Us All,” Brian X. Chen comes about as close as a mainstream press reports can without poking the apple-cart of corporate golden eggs over the wall in revealing how stupid it is for anyone to put any critical data on anyone else’s hardware.

The article covers a breach at LastPass, a password management service which invites users to store their password’s on LastPass’s computers somewhere in exchange for letting LastPass keep track of every website you visit that requires a password. For reasons that are a little hard to understand, rather a lot of people thought this was an acceptable idea and entrusted their passwords to what are likely important web services to some random company and their random employees that nobody using the service has ever met or ever will without any warranty or guarantee or legal recourse at all when the inevitable happens and there’s a data breach.

I suppose they believe that because the site appears to offer a service that looks like an analog of a safety deposit box, that there’d be some meaningful security guarantee just as users of gmail seem to assume that if you use gmail your email will be in some way “secure” and “private,” despite what the CEO of google tells you.

Obviously, LastPass was hacked and, obviously, every users’s secure account list (including their OnlyFans and Grindr accounts) and password database was exposed. This is guaranteed to happen eventually at every juicy target on the internet. It’s just probability: an internet service is exposed to everyone on the planet with a network connection (5,569,029,076 people as of today), and every target is attacked constantly (my own Fail2Ban has blocked 2,899,324 malicious packets) and even if they’re Google, they’re not smarter than the 5B+ people who can take a shot at them any time.

The most hilarious part of this is how idiotically fragile companies make themselves by chaining various “cloud services” into their service provision: LastPass was using a Cloud-Based Backup service that was hacked. People.. people.. that level of stupidity is unforgivable, but sadly not remotely criminal (though it should be). The risk of failure in a chained service increases exponentially with the length of the chain. Every dependency is a humiliation. This goes for developers too.

This breach means at least the attackers know every pr0n website millions of users have accounts on (as well as banks etc.) It isn’t clear how easily the passwords themselves will be exposed and LastPass’s technical description suggests a fairly robust encryption process which should be comforting if your master password is a completely randomly generated string of at least 12 characters you’ve managed to memorize, like n56PQZAeXSN6GBWB. If your password is some combination of dictionary words because you assumed, say, the master password was stored securely and you were only risking the password generator’s random passwords on sites (actually, not a bad strategy if you don’t then screw up security by using a commercial cloud-based password keeper that exposes your master password to global attack, but whatever), well if you did that check have i been pwned regularly for the next year and change every password you have.

The big lesson here is if you put your or your company’s data on someone else’s hardware, it isn’t your data any more it is theirs and you should assume that data is, or will soon be, public. So do not ever put critical data of any sort on anyone else’s hardware ever. It’s just stupid. Don’ t do it.

If you insist on doing so because, say, you’re not an IT person but you’d still like email or you’re a small company who can’t afford to hire an IT person, or who’s CIO has cut some side deals to “cut costs” by firing the IT staff and gifting the IT budget to his buddies running some crappy servers somewhere (and for some reason you haven’t fired that CIO yet), I’d suggest you have your lawyers carefully review recourse in the event of incompetence or malice. My personal starting point is to ask questions like the ones in this post and make sure the answers give comfort that the provider’s liability matches your risk.

What we need is a legal framework that makes every bit of user data a toxic asset. If a computer under your care has other people’s confidential data on it and that data is exposed to any parties not specifically and explicitly authorized by the person to whom the data is pertinent, you should be subject to a penalty sufficient to not just make a person who is harmed by the breach whole, but sufficient to dissuade anyone from ever taking a risk that could result in such an exposure again.

Companies who have business models that involve collecting and storing data about individuals should be required to hold liability insurance sufficient to cover all damages plus any punitive awards that might arise from mishandling or other liability. It is reasonable to expect that such obligations would make cloud services other than fully open/exposed ones with no personal data absurdly unprofitable and end them entirely; and this would be the optimal outcome.

Posted at 17:03:27 GMT-0700

Category: EventsPrivacySecurityTechnology

Some gnuplot and datamash adventures

Thursday, December 29, 2022

I’ve been collecting data on the state of the Ukrainian digital network since about the start of the war on a daily basis, some details of the process are in this post. I was creating and updating maps made with qgis when particularly notable things happened, generally correlated with significant damage to the Ukrainian power infrastructure (and/or data infrastructure). I wanted a way to provide a live update of the feed, and as all such projects go, the real reward was the friends made along the way to an automatically updated “live” summary stats table and graph.

My data collection tools generate some rather large CSV files for the mapping tools, but to keep a running summary, I also extract the daily total of responding servers and compute the day over day change and append those values to a running tally CSV file. A few really great tools from the Free Software Foundation help turn this simple data structure into a nicely formatted (I think) table and graph: datamash and gnuplot. I’m not remotely expert enough to get into the full details of these excellent tools, but I put together some tricks that are working for me and might help someone else trying to do something similar.

Using datamash for Statistical Summaries

Datamash is a great command line tool for getting statistics from text files like logs or CSV files or other relatively accessible and easily managed data sources. It is quite a bit easier to use and less resource intensive than R, or Gnu Octave, but obviously also much more limited. I really only wanted very basic statistics and wanted to be able to get to them from Bash with a cron job calling a simple script and for that sort of work, datamash is the tool of choice.

Basic statistics are easy to compute with datamash; but if you want a thousands grouped comma delimited median value of a data set that looks like 120,915 (say), you might need a slightly more complicated (but still one-liner) command like this:

Median="$(/usr/bin/datamash -t, median 2  < /trend.csv | datamash round 1 | sed ':a;s/\B[0-9]\{3\}\>/,&/;ta')"

Median=               Assign the result to the variable $Median
-t,                   Comma delimited (instead of tab, default)
median                one of a bazillion stats datamash can compute
2                     use column two of the CSV data set.
< /trend.csv          feed the previous command a CSV file nom nom
| datamash round 1    pipe the result back to datamash to round the decimals away
| sed (yadda yadda)   pipe that result to sed to insert comma thousands separator*

*HT @sim

Once I have these values properly formatted as readable strings, I needed a way to automatically insert those updates into a consistently formatted table like this:

Sample statistics chart

I first create a dummy table with a plugin called TablePress with target dummy values (like +++Median) which I then extract as HTML and save as a template for later modification. With the help of a little external file inclusion code into WordPress, you can pull that formatted but now static HTML back into the post from a server-side file. Now all you need to do is modify the HTML file version of the table using sed via a cron job to replace the dummy values with the datamash computed values and then scp the table code with updated data to the server so it is rendered into the viewed page:

sed -i -e "s/+++Median/$Median/g" "stats_table.html"
/usr/bin/sshpass -P assphrase -f '~/.pass' /usr/bin/scp -r stats_table.html user@site.org:/usr/local/www/wp-content/uploads/stats_table.html

For this specific application the bash script runs daily via cron with appropriate datamash lines and table variable replacements to keep the table updated on a daily basis. It first copies the table template into a working directory, computes the latest values with datamash, then seds those updated values into the working copy of the table template, and scps that over the old version in the wp-content directory for visitor viewing pleasure.

Using gnuplot for Generating a Live Graph

The basic process of providing live data to the server is about the same. A different wordpress plugin, SVG Support, adds support for SVG filetypes within WordPress. I suspect this is not default since svg can contain active code, but a modern website without SVG support is like a fish without a bicycle, isn’t it? SVG is useful in this case in another way, the summary page integrates a scaled image which is linked to the full size SVG file. For bitmapped files, the scaled image (or thumbnail) is generated by downsampling the original (with ImageMagick, optimally, not GD) and that needs an active request (i.e. PHP code) to update. In this case, there’s no need since the SVG thumbnail is the just the original file resized—SVG: Scalable Vector Graphics FTW.

Gnuplot is a impressively full-featured graphing tool with a complex command structure. I had to piece together some details from various sources and then do some sedding to get the final touches as I wanted them. As every plot is different, I’ll just document the bits I pieced together myself, the plotting details go in the gnuplot command script, the other bits in a bash script executed later to add some non-standard formatting to the gnuplot svg output.

Title of the plot

The SVG block is set as “Gnuplot” and I don’t see any way to change that from the command line, so I replaced it with the title I wanted, using a variable for the most recently updated data point extracted by datamash as above:

sed -i -e "s/Gnuplot<\/title>/<title>Ukrainian Servers Responding on port 80 from 2022-03-05 to $LDate<\/title>/g" "/UKR-server-trend.svg" sed -i -e "s/<desc>Produced by GNUPLOT 5.2 patchlevel 2 <\/desc>/<desc>Daily automated update of Ukrainian server response statistics.<\/desc>/g" "/UKR-server-trend.svg"</desc></desc>

This title value is used as the tab title. I’m not sure where the will show up, but likely read by various spiders and is an accessibility thing for online readers.

Last Data Point

I wanted the most recent server count to be visible at the end of the plot. This takes two steps: first plot that data point alone with a label (but no title so it doesn’t show up in the data key/legend) by adding a separate plot of just that last datum like:

"< tail -n 1 '/trend.csv'" u 1:2:2 w labels notitle

This works fine, but if you hover over the data point, it just pops up “gnuplot_plot_4” and I’d rather have more useful data so I sed that out and replace it with some values I got from datamash queries earlier in the script like so:

sed -i -e "s/gnuplot_plot_4<\/title>/<title>Tot: $LTot; Diff: $LDif<\/title>/g" "/UKR-server-trend.svg"

Adding Link Text

SVG supports clickable links, but you can’t (I don’t think) define those URLs in the label command. So first set the visible text with a simple gnuplot label command:

set label "Black Rose Technology https://brt.llc" at graph 0.07,0.03 center tc rgb "#693738" font "copperplate,12"

and then enhance the resulting svg code with a link using good old sed:

sed -i -e "s#Black Rose Technology https://brt.llc#Black Rose Technology https://brt.llc#g" "/UKR-server-trend.svg"
Hovertext for the Delta Bars

Adding hovertext to the ends of the daily delta bars was a bit more involved. The SVG type is interpreted by most browsers as a hoverable element but adding visible data labels to the ends of the bars makes the graph icky noisy. Fortunately SVG supports transparent text. To get all this to work, I replot the entire bar graph data series as just labels like so:

'/trend.csv' using 1:3:3 with labels font "arial,4" notitle axes x1y2

But this leaves a very noisy looking graph, so we pull out our trusty sed to set opacity to “0” so they’re hidden:

sed -i -e "s/\(stroke=\"none\" fill=\"black\"\)\( font-family=\"arial\" font-size=\"4.00\"\)/\1 opacity=\"0\"\2/g" "/UKR-server-trend.svg"

and then find the data value and generate a element of that data value using back-references. I must admit, I have not memorized regular expressions to the point where I can just write these and have them work on the first try: gnu’s sed tester is very helpful.

sed -i -e "s/\(\)\([-1234567890]*\)<\/tspan><\/text>/\1\2\2<\/title><\/tspan><\/text>/g" "/UKR-server-trend.svg"

And you get hovertext data interrogation. W00t!

Sample of gnuplot showing hovertext

Note that cron jobs are executed with different environment variables than user executed scripts, which can result in date formatting variations (which can be set explicitly in gnuplot) and thousands separator and decimal characters (,/.). To get consistent results with a cron job, explicitly set the appropriate locale, either in the script like

#!/bin/bash
LC_NUMERIC=en_US.UTF-8
...

or for all cron jobs as in crontab -e

LC_NUMERIC=en_US.UTF-8
MAILTO=user@domain.com
# .---------------- minute (0 - 59) 
# |    .------------- hour (0 - 23)
# |    |      .---------- day of month (1 - 31)
# |    |      |    .------- month (1 - 12) OR jan,feb,mar,apr ... 
# |    |      |    |    .---- day of week (0 - 6) (Sunday=0 or 7)  OR sun,mon,tue,wed,thu,fri,sat 
# |    |      |    |    |
# *    *      *    *    *    

The customized SVG file is SCPd to the server as before, replacing the previous day’s. Repeat visitors might have to clear their cache. It’s also important to disable caching on the site for the page, for example if using wp super cache or something, because there’s no signal to the cache management engine that the file has been updated.

Posted at 05:19:01 GMT-0700

Category: GeopostHowToLinuxTechnology

Smol bash script for finding oversize media files

Friday, September 2, 2022

Sometimes you want to know if you have media files that are taking up more than their fair share of space. You compressed the file some time ago in an old, inefficient format, or you just need to archive the oversize stuff, this can help you find em. It’s different from file size detection in that it uses mediainfo to determine the media file length and a variety of other useful data bits and wc -c to get the size (so data rate includes any file overhead), and from that computes the total effective data rate. All math is done with bc, which is usually installed. Files are found recursively (descending into sub-directories) from the starting point (passed as first argument) using find.

basic usage would be:

./find-high-rate-media.sh /search/path/tostart/ [min bpp] [min data rate] [min size] > oversize.csv 2>&1

The script will then report media with a rate higher than minimum and size larger than minimum as a tab delimited list of filenames, calculated rate, and calculated size. Piping the output to a file, output.csv, makes it easy to sort and otherwise manipulate in LibreOffice Calc as a tab delimited file. The values are interpreted as the minimum for suppression of output, so any file that exceeds all three minimum triggers will be output to the screen (or .csv file if so redirected).

The script takes four command line variables:

  • The starting directory, which defaults to . [defaults to the directory the script is executed in]
  • The minimum bits per pixel (including audio, sorry) for exclusions (i.e. more bpp and the filename will be output) [defaults to 0.25 bpp]
  • The minimum data rate in kbps [defaults to 1 kbps so files would by default only be excluded by bits per pixel rate]
  • The minimum file size in megabytes [defaults to 1mb so files would by default only be excluded by bits per pixel rate]

Save the file as a name you like (such as find-high-rate-media.sh) and # chmod +x find-high-rate-media.sh and run it to find your oversized media.

!/usr/bin/bash
############################# USE #######################################################
# This creates a tab-delimeted CSV file of recursive directories of media files enumerating
# key compression parameters.  Note bits per pixel includes audio, somewhat necessarily given
# the simplicity of the analysis. This can throw off the calculation.
# find_media.sh /starting/path/ [min bits per pixel] [min data rate] [min file size mb]
# /find-high-rate-media.sh /Media 0.2 400 0 > /recomp.csv 2>&1
# The "find" command will traverse the file system from the starting path down.
# if output isn't directed to a CSV file, it will be written to screen. If directed to CSV
# this will generate a tab delimted csv file with key information about all found media files
# the extensions supported can be extended if it isn't complete, but verify that the 
# format is parsable by the tools called for extracting media information - mostly mediainfo
# Typical bits per pixel range from 0.015 for a HVEC highly compressed file at the edge of obvious
# degradation to quite a bit higher.  Raw would be 24 or even 30 bits per pixel for 10bit raw.
# Uncompressed YUV video is about 12 bpp. 
# this can be useful for finding under and/or overcompressed video files
# the program will suppress output if the files bits per pixel is below the supplied threshold
# to reverse this invert the rate test to " if (( $(bc  <<<"$rate < $maxr") )); then..."
# if a min data rate is supplied, output will be suppressed for files with a lower data rate
# if a min file size is supplied, output will be suppressed for files smaller than this size
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n  starting by default in the current directory and searchign recusrively \n"
  dir="$(pwd)"
  else
        dir="$1"
        echo -e "starting in " $dir ""
fi

if [ -z "$2" ]; then
  printf "\nUsage:\n  returning files with bits per pixel greater than default max of .25 bpp \n" 
  maxr=0.25
  else
        maxr=$2
        echo -e "returning files with bits per pixel greater than " $maxr " bpp" 
fi

if [ -z "$3" ]; then
  printf "\nUsage:\n  returning files with data rate greater than default max of 1 kbps \n" 
  maxdr=1
  else
        maxdr=$3
        echo -e "returning files with data rate greater than " $maxdr " kbps" 
fi


if [ -z "$4" ]; then
  printf "\nUsage:\n  no min file size provided returning files larger than 1MB \n" 
  maxs=1
  else
        maxs=$4
        echo -e "returning files with file size greater than " $maxs " MB  \n\n" 
fi


msec="1000"
kilo="1024"
reint='^[0-9]+$'
refp='^[0-9]+([.][0-9]+)?$'

echo -e "file path \t rate bpp \t rate kbps \t V CODEC \t A CODEC \t Frame Size \t FPS \t Runtime \t size MB"

find "$dir" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv -iname \*.m4v \) -print0 | while read -rd $'\0' file
do
  if [[ -f "$file" ]]; then
    bps="0.1"
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    if ! [[ $duration =~ $refp ]] ; then
       duration=$msec
    fi
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    audio="$(mediainfo --Inform="Audio;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    if ! [[ $framerate =~ $refp ]] ; then
       framerate=100
    fi
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    if ! [[ $width =~ $reint ]] ; then
       width=1
    fi
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    if ! [[ $height =~ $reint ]] ; then
       height=1
    fi
    pixels=$(bc -l <<<"scale=1; ${width}*${height}*${seconds}*${framerate}")
    bps=$(bc -l <<<"scale=4; ${size}*8/${pixels}")
    if (( $(bc -l <<<"$bps > $maxr") )); then
        if (( $(bc -l <<<"$sizem > $maxs") )); then
            if (( $(bc -l <<<"$rate > $maxdr") )); then
                echo -e "$file" "\t" $bps "\t" $rate "\t" $codec "\t" $audio "\t" $width"x"$height "\t" $framerate "\t" $rtime "\t" $sizem
            fi
        fi
    fi
  fi
done

Results might look like:

Another common task is renaming video files with some key stats on the contents so they’re easier to find and compare. Linux has limited integration with media information (dolphin is somewhat capable, but thunar not so much). This little script also leans on mediainfo command line to append the following to the file name of media files recursively found below a starting directory path:

  • WidthxHeight in pixels (e.g. 1920×1080)
  • Runtime in HH-MM-SS.msec (e.g. 02-38-15.111) (colons aren’t a good thing in filenames, yah, it is confusingly like a date)
  • CODEC name (e.g. AVC)
  • Datarate (e.g. 1323kbps)

For example

kittyplay.mp4 -> kittyplay_1280x682_02-38-15.111_AVC_154.3kbps.mp4

The code is also available here.

#!/usr/bin/bash
PATH="/home/gessel/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

############################# USE #######################################################
# find_media.sh /starting/path/ (quote path names with spaces)
########################################################################################

# No argument given?
if [ -z "$1" ]; then
  printf "\nUsage:\n  pass a starting point like \"/Downloads/Media files/\" \n" 
  exit 1
fi

msec="1000"
kilo="1024"
s="_"
x="x"
kbps="kbps"
dot="."

find "$1" -type f \( -iname \*.avi -o -iname \*.mkv -o -iname \*.mp4 -o -iname \*.wmv \) -print0 | while read -rd $'\0' file
do
  if [[ -f "$file" ]]; then
    size="$(wc -c  "$file" |  awk '{print $1}')"
    duration="$(mediainfo --Inform="Video;%Duration%" "$file")"
    seconds=$(bc -l <<<"${duration}/${msec}")
    sizek=$(bc -l <<<"scale=1; ${size}/${kilo}")
    sizem=$(bc -l <<<"scale=1; ${sizek}/${kilo}")
    rate=$(bc -l <<<"scale=1; ${sizek}/${seconds}")
    codec="$(mediainfo --Inform="Video;%Format%" "$file")"
    framerate="$(mediainfo --Inform="General;%FrameRate%" "$file")"
    rtime="$(mediainfo --Inform="General;%Duration/String3%" "$file")"
    runtime="${rtime//:/-}"
    width="$(mediainfo --Inform="Video;%Width%" "$file")"
    height="$(mediainfo --Inform="Video;%Height%" "$file")"
    fname="${file%.*}"
    ext="${file##*.}"
    $(mv "$file" "$fname$s$width$x$height$s$runtime$s$codec$s$rate$kbps$dot$ext")
  fi
done

If you don’t have mediainfo installed,

sudo apt update
sudo apt install mediainfo
Posted at 10:18:58 GMT-0700

Category: AudioHowToLinuxvideo