openHab2 startup chaos

Thursday, October 25. 2018

starting and stopping openhab2 is a messy affair, filling the logs with loads of Java's verbose exception messages and since the order of loaded models is basically random things may or may not work as intended when items have not been loaded or rules not executed that are assumed to be there.

As it was manual corrections and reloads were needed after each restart.

There is an elegant workaround discussed at community.openhab.org/t/cleaning-up-the-startup-process-renaming-rules-windows-possible/38438/9 which uses systemd's

ExecStartPre and ExecStartPost commands to deactivate all rules before starting openhab intially, and then reactivate them one by one when the system is up and has it's feet on the ground. a small bash script does the renaming. Since the openhab2.service file is part of the package and thus is replaced with each update modifying the .service file would need to be repeated whenever a new version gets installed, but, systemd provides a way to override a service definition. systemctl edit openhab2.service creates an override dir and opens the editor to allow creating service file that is not replaced by the next update. Since the bash script works the list of rules -files in an alphanumerical order a naming scheme like praefixing all .rules with a three digit number finally allows to control the order of rules loading.

 

The bash script sits at some place like /etc/openhab2/exec-scripts/and looks like:

 

 

#!/bin/bash
#######################
#
# clean up the start process 
#  starting rules in a sorted manner after openhab2 got it's feet on the ground
#  called from systemd pre and post start 
#  to rename *.rules away initially and then rename them back one by one
#  
#  /etc/systemd/system/openhab2.service.d/.#override.conf041cd66becaa4539
#
# $3 allows to distinguish between pre and post action for other THings to eXecute
#
# REF: https://community.openhab.org/t/cleaning-up-the-startup-process-renaming-rules-windows-possible/38438/9


ORG=$1
NEW=$2
THX=$3
DUR=1
#IGNORE=moverules.rules

for f in /etc/openhab2/rules/*.${ORG};
do
    CURRENT=$(basename $f)
    if [ "$CURRENT" == "$IGNORE" ]    
    then
        echo "ignoring $IGNORE"
    else
        OLDFILE=$f
        NEWFILE=${f%$ORG}$NEW
        mv "$OLDFILE" "$NEWFILE"
    fi
    # let some time between each load
    if [ "$THX" == "POST" ]
    then
        /bin/sleep $DUR
    fi
done

# some things left on startup
if [ "$THX" == "POST" ]
then
    /usr/bin/touch /etc/openhab2/tradfri.things
fi



 

The relevant lines in the [Service] part of the openhab2.service file are:

 

 
ExecStartPre=/etc/openhab2/exec-scripts/move_rules_at_start.sh rules rules_ 
ExecStart=/usr/share/openhab2/runtime/bin/karaf $OPENHAB_STARTMODE 
ExecStop=/usr/share/openhab2/runtime/bin/karaf stop 
ExecStartPost=-/bin/sleep 90 
ExecStartPost=/etc/openhab2/exec-scripts/move_rules_at_start.sh rules_ rules POST 
TimeoutStartSec=180  

 

(The lines with ExecStart / ExecStop are already there.)

dash ruft smartHome

Wednesday, March 7. 2018

Und wie ?

das Teil hat eine recht kleine Batterie und Amazon hat die Button deshalb so ausgelegt, dass sie nur nach dem Knopfdruck kurz booten, ins WLAN gehen, ihre Message abgeben und sich dann wieder runterfahren.
Bei der Ersteinrichtung lernt das dash die SSID und das Passwort für das Wlan, aber die IP holt es sich jedesmal vom dhcpd.
Und dabei kann man den Button erwischen.

Im Prinzip konfigurierst du dir den dash erst mal so, wie Amazon vorgibt. (Wähle dir einen dash, der Bestellungen ~5€ von Produkten erlaubt, die du gebrauchen kannst. Du bekommst nicht 4,99 erstattet sondern auf die erste Bestellung bis zu 4,99€ Rabatt. Zu billig darf das Produkt also nicht sein) 
Und ja, einmal bestellen, um den Rabatt zu kassieren. (Persönlich finde ich, dass A. da ausreichend 'Ne-doch-nich'-Schwellen und Kindersicherungen eingebaut hat. )
Dann setzt du den Dash zurück und gehst den Weg noch einmal, aber den letzten Schritt: Produktwahl, lässt du aus. Schon hast du einen Dash, der funktioniert, aber nicht richtig bestellen kann. An sich genau, was wir wollen. A. will das nicht und schickt pro Klick auf den dash eine Mail...

Jetzt ist es an der Zeit, in /var/log/daemon.log (oder wo bei dir der dhcpd seine logs ablegt) nachzuschauen, welche MAC der dash benutzt. Und für die MAC legt du eine feste IP an. 
Mit WireShark kannst du nun zuschauen, was der dash mit A. so zu besprechen hat. Und dann machst du in der firewall genau diese Verbindung zu. Aus ist's mit den Mails.

So, jetzt fehlt eigentlich nur noch die Hauptsache. Etwas muss im WLAN die Fühler hochhalten und die dhcp-requests des Dash erschnüffeln. python mit scapy können sowas. Wie genau, tja, da gibt es einen Rüstungswettlauf zwischen A. und Technikforschern. Also googeln und durchprobieren. (siehe unten)

Letztlich hast du etwas, das nach ~3 sec einen event auf deinem smarthome*-Bus auslösen kann, dann ~10 Sekunden Unerreichbarkeit (der dash versucht verzweifelt, A. zu erreichen und schmollt anschliessend noch etwas.)

Alles in allem ein konkurrenzlos günstiger Wifi-Switch mit heftiger Latenz.

 Bei mir tut dies:

#!/usr/bin/python
#############################################
from scapy.all import *
import sys, os

def arp_display(pkt):
    if pkt.haslayer(DHCP):

        if pkt[Ether].src == "fc:65:de:bb:xx:yy": 
            os.system('/usr/bin/curl --silent --header "Content-Type: text/plain" --request POST --data "ON" http://192.168.55.44:8080/rest/items/dashConnect_01')

        if pkt[Ether].src == "fc:a6:67:cc:xx:yy": 
            os.system('/usr/bin/curl --silent --header "Content-Type: text/plain" --request POST --data "ON" http://192.168.55.44:8080/rest/items/dashConnect_02')

print sniff(prn=arp_display, store=0,count=0)


und damit das als daemon läuft und auch nach einem Neustart des raspi wieder läuft, habe ich es als einen service bei systemd angelegt:
/lib/systemd/system/dashsniffer.service 



[Unit]                                                                                                                                                                                            
Description=dash dhcp Paket Sniffer                                                                                                                                                              
After=multi-user.target                                                                                                                                                                           
                                                                                                                                                                                                  
[Service]                                                                                                                                                                                         
Type=simple                                                                                                                                                                                       
ExecStart=/usr/bin/python /var/local/scripts/dash/dashSniffer.py                                                                                                                                               
Restart=on-abort                                                                                                                                                                                  

 
[Install]
WantedBy=multi-user.target


Debian 9.2

Apache/2.4.25 (Debian)
Nextcloud 12.0.3
 
https://github.com/husisusi/officeonlin-install.sh/blob/master/README.md
https://github.com/husisusi/officeonlin-install.sh/issues/135
 
mkdir a folder for the script, clone or download the .zip
open officeonline-install.cfg with an editor and adapt the POCO parameters like ExaconAT shared it in the issue 135 (link above)
 
let the script run (started it w/o any parameters). It took about 80 min on my box.
I had to restart the script once because it complained over an inconsistency in /etc/group which needed to be fixed first. After restart the script managed to resume where it had stopped before.
 
created a subdomain for the lool (LibreOffice OnLine) to run in in DNS, a valid certificate for https, a virtual host config to tell apache how to proxy. Check that Apache has the required modules enabled:
proxy_wstunnel proxy proxy_http ssl
 
For the latter I used the example Nextcloud recommends in https://nextcloud.com/collaboraonline/ and this was a source of problems later. It took me hours of try and error & recherche to learn that all the ProxyPassReverse lines need to have a trailing slash at the end of the right hand argument. The relevant bit of the config I use is:
 
# Encoded slashes need to be allowed
AllowEncodedSlashes NoDecode

# Container uses a unique non-signed certificate
SSLProxyEngine On
SSLProxyVerify None
SSLProxyCheckPeerCN Off
SSLProxyCheckPeerName Off

# keep the host
ProxyPreserveHost On

# static html, js, images, etc. served from loolwsd
# loleaflet is the client part of LibreOffice Online
ProxyPass           /loleaflet https://127.0.0.1:9980/loleaflet retry=0
ProxyPassReverse    /loleaflet https://127.0.0.1:9980/loleaflet/

# WOPI discovery URL
ProxyPass           /hosting/discovery https://127.0.0.1:9980/hosting/discovery retry=0
ProxyPassReverse    /hosting/discovery https://127.0.0.1:9980/hosting/discovery/

# Main websocket
ProxyPassMatch "/lool/(.*)/ws$" wss://127.0.0.1:9980/lool/$1/ws nocanon

# Admin Console websocket
ProxyPass   /lool/adminws wss://127.0.0.1:9980/lool/adminws

# Download as, Fullscreen presentation and Image upload operations
ProxyPass           /lool https://127.0.0.1:9980/lool
ProxyPassReverse    /lool https://127.0.0.1:9980/lool/
 
The script creates a self signed certificate for lool in /etc/loolwsd
This is not helpful when things do not work at once because chrome and firefox are so very strict against self signed certs and any direct tests of the lool subdomain and the virtual host there are hindered. I got around this by changing the path to the certificate (cert, ca-chain privkey) to point at my valid cert by edit in /opt/online/loolwsd.xml. An increased log level here and file enable="true" are other useful settings here. 
 
Things look fine when https://lool.domain.tld:9980 results in an empty page with ok and https://lool.domain.tld:9980/hosting/discovery returns an xml of wopi discovery. If the latter gives a 500 then the proxy config may be the reason.
 
In nextcloud admin Collabora Online the entry is https://lool.domain.tld:9980 - during my recherche I found several discussions with recommendations to not include the port, but this only adds another error.
 
Sources of error messages included nextcloud admin logging (connection refused as long as the proxying doesnt work), /var/log/loolwsd.log (needs log level increased and file enabled) and of course the messages of systemctl status  loolwsd and systemctl status apache2
 
 
 
(Page 1 of 1, totaling 3 entries)