VERY slow, erratic piston execution, Kukuharmony


#1

This is really annoying. I have some pistons triggered by virtual switches, set to turn on various apps on my TV, like Plex. A Raspberry Pi sends individual IR commands to my Samsung TV, controlled by webCoRE. They’ve been fine for a year, snappy and precise, but now a 3-second wait can take anywhere from 3 seconds to 10 minutes. I can’t narrow down what the problem is. Everything leads me to think it’s webCoRE. Maybe I have too many pistons (110)? Maybe I need to create another instance of webCoRE?
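
For context, the Pi side is just an HTTP listener that relays keystrokes to the TV. A minimal Python sketch of the kind of call webCoRE makes to it might look like this (the address, port, endpoint path, and hub/device slugs here are assumptions based on a typical harmony-api install, not the actual config):

    import time
    import requests

    PI = "http://192.168.1.50:8282"   # hypothetical address of the Pi running harmony-api
    HUB = "living-room"               # hub slug, as listed by the API
    TV = "samsung-tv"                 # device slug

    def send_ir(command):
        # one HTTP call per IR keystroke; the Pi relays it to the TV
        requests.post(f"{PI}/hubs/{HUB}/devices/{TV}/commands/{command}", timeout=5)

    send_ir("home")    # roughly what the piston does: a keystroke,
    time.sleep(3)      # then a wait, then the next keystroke...
    send_ir("select")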

For example, my piston is set to turn off a virtual switch at the end of its run; it’s been 20 minutes and the virtual switch is still on!

I read somewhere that turning on Parallelism would help my situation, but it hasn’t.

16/02/2019, 19:37:24 +579ms
+0ms ╔Received event [Home].time/recovery = 1550342244578 with a delay of 0ms
+908ms ║RunTime Analysis CS > 352ms > PS > 504ms > PE > 52ms > CE
+912ms ║Runtime (39510 bytes) successfully initialized in 504ms (v0.3.109.20181207) (910ms)
+913ms ║╔Execution stage started
+924ms ║╚Execution stage complete. (10ms)
+938ms ╚Event processed successfully (937ms)

16/02/2019, 19:37:02 +304ms
+1ms ╔Received event [Home].execute = recovery with a delay of 226ms
+124ms ║RunTime Analysis CS > 16ms > PS > 57ms > PE > 50ms > CE
+126ms ║Runtime (39496 bytes) successfully initialized in 57ms (v0.3.109.20181207) (125ms)
+127ms ║╔Execution stage started
+140ms ║║Cancelling condition #13’s schedules…
+142ms ║║Condition #13 evaluated false (7ms)
+145ms ║║Cancelling condition #1’s schedules…
+146ms ║║Condition group #1 evaluated false (state changed) (13ms)
+171ms ║╚Execution stage complete. (44ms)
+182ms ╚Event processed successfully (182ms)

16/02/2019, 19:37:02 +270ms
+1ms ╔Received event [Home].execute = recovery with a delay of 216ms
+125ms ║RunTime Analysis CS > 17ms > PS > 60ms > PE > 48ms > CE
+128ms ║Runtime (39496 bytes) successfully initialized in 60ms (v0.3.109.20181207) (126ms)
+129ms ║╔Execution stage started
+143ms ║║Cancelling condition #13’s schedules…
+144ms ║║Condition #13 evaluated false (9ms)
+145ms ║║Cancelling condition #1’s schedules…
+146ms ║║Condition group #1 evaluated false (state changed) (11ms)
+172ms ║╚Execution stage complete. (43ms)
+184ms ╚Event processed successfully (184ms)

16/02/2019, 19:36:14 +629ms
+1ms ╔Received event [TV Plex].switch = on with a delay of 394ms
+238ms ║RunTime Analysis CS > 19ms > PS > 158ms > PE > 62ms > CE
+242ms ║Runtime (39497 bytes) successfully initialized in 158ms (v0.3.109.20181207) (240ms)
+244ms ║╔Execution stage started
+258ms ║║Comparison (enum) on changes_to (string) on = true (2ms)
+261ms ║║Cancelling condition #13’s schedules…
+263ms ║║Condition #13 evaluated true (11ms)
+265ms ║║Cancelling condition #1’s schedules…
+267ms ║║Condition group #1 evaluated true (state changed) (15ms)
+272ms ║║Cancelling statement #2’s schedules…
+550ms ║║Executed physical command [TV].custom1() (272ms)
+551ms ║║Executed [TV].custom1 (274ms)
+556ms ║║Executed virtual command [TV].wait (1ms)
+557ms ║║Waiting for 3000ms
+3628ms ║║Executed physical command [TV].custom4() (68ms)
+3629ms ║║Executed [TV].custom4 (70ms)
+3634ms ║║Executed virtual command [TV].wait (0ms)
+3635ms ║║Waiting for 2000ms
+5701ms ║║Executed physical command [TV].custom2() (63ms)
+5702ms ║║Executed [TV].custom2 (65ms)
+5707ms ║║Executed virtual command [TV].wait (0ms)
+5708ms ║║Waiting for 3000ms
+8773ms ║║Executed physical command [TV].custom2() (62ms)
+8775ms ║║Executed [TV].custom2 (64ms)
+8779ms ║║Executed virtual command [TV].wait (1ms)
+8781ms ║║Requesting a wake up for Sat, Feb 16 2019 @ 7:36:25 PM CET (in 2.0s)
+8796ms ║╚Execution stage complete. (8552ms)
+8810ms ║Setting up scheduled job for Sat, Feb 16 2019 @ 7:36:25 PM CET (in 1.973s)
+8820ms ╚Event processed successfully (8821ms)


#2

Any time a piston runs longer than approximately 10 seconds, it stops partway through and schedules a wake-up to finish the remaining tasks. (This is a SmartThings limitation, not webCoRE.)

Notice in your log it made it all the way down to line 34 before scheduling a wake-up at 7:36:25 (two seconds later).

Normally, the solution to this is to set the TCP (Task Cancellation Policy) to Never on the later blocks, but you have already done this.
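
In rough pseudocode (this is only an illustration of the behaviour, not webCoRE’s actual implementation), the platform does something like:

    import time

    BUDGET = 10.0   # approximate per-execution limit, in seconds

    def run(tasks, start=0):
        began = time.monotonic()
        for i in range(start, len(tasks)):
            if time.monotonic() - began > BUDGET:
                return i    # out of budget: schedule a wake-up, then resume from task i
            tasks[i]()      # waits inside tasks count against the budget
        return None         # everything finished in a single pass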

To dig any deeper, we’d need to see a current screenshot with Trace turned on. It shows extra information that really helps when digging through the logs.


For what it’s worth, most “lost” or “delayed” commands are due to network congestion, which of course, can happen for many different reasons.

Also, if it were me, I would disable Parallelism. That setting effectively creates multiple versions of the same piston, which is only helpful if you plan on flipping Switch 24 multiple times before the first run has finished executing.
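
As a rough Python analogy (not how webCoRE is implemented), Parallelism means a second trigger starts a second copy of the piston while the first copy is still inside its waits:

    import threading, time

    def piston_run(run_id):
        print(f"run {run_id}: sending first command")
        time.sleep(3)    # the 3-second Wait in the piston
        print(f"run {run_id}: sending next command")

    # Two triggers one second apart: with Parallelism both runs overlap,
    # which only helps if you genuinely flip the switch again that fast.
    threading.Thread(target=piston_run, args=(1,)).start()
    time.sleep(1)
    threading.Thread(target=piston_run, args=(2,)).start()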


#3

I’ve disabled Parallelism.
But @WCmore, could you tell me how to turn on Trace, please? For the life of me I can’t find it; I’ve googled this unsuccessfully too!


#4
  • Load the piston, but do not hit Edit.
  • Press Test, which should do nothing in this piston.
  • Then press Trace.



A fresh log with Parallelism disabled would be nice too, but it might work right off the bat.
*fingers crossed*


#5

Of course, the first time I ran it just now, it worked right off the bat; these pistons are like faulty cars, you take them to the mechanic and they behave.
But running it again I still see delays: a 1-second wait is more like 6 seconds, whereas it used to be precise. I have been struggling with these TV pistons for a few weeks now, and have tried Parallelism off and on, as well as changing other settings. I have noticed it works more consistently in the mornings around 9 a.m. UK time, when America is asleep, which leads me to think you are right about the ST cloud.

16/02/2019, 21:25:50 +579ms
+0ms ╔Received event [Home].time/recovery = 1550348750578 with a delay of 0ms
+478ms ║RunTime Analysis CS > 180ms > PS > 254ms > PE > 45ms > CE
+481ms ║Runtime (39562 bytes) successfully initialized in 254ms (v0.3.109.20181207) (480ms)
+482ms ║╔Execution stage started
+484ms ║╚Execution stage complete. (1ms)
+486ms ╚Event processed successfully (486ms)
16/02/2019, 21:25:35 +153ms
+1ms ╔Received event [TV Plex].switch = off with a delay of 896ms
+134ms ║RunTime Analysis CS > 27ms > PS > 56ms > PE > 50ms > CE
+136ms ║Runtime (39546 bytes) successfully initialized in 56ms (v0.3.109.20181207) (134ms)
+137ms ║╔Execution stage started
+148ms ║║Comparison (enum) off changes_to (string) on = false (0ms)
+150ms ║║Condition #13 evaluated false (6ms)
+151ms ║║Condition group #1 evaluated false (state did not change) (8ms)
+154ms ║╚Execution stage complete. (16ms)
+155ms ╚Event processed successfully (155ms)
16/02/2019, 21:25:34 +418ms
+1ms ╔Received event [Home].execute = recovery with a delay of 143ms
+136ms ║RunTime Analysis CS > 29ms > PS > 68ms > PE > 39ms > CE
+139ms ║Runtime (39547 bytes) successfully initialized in 68ms (v0.3.109.20181207) (136ms)
+140ms ║╔Execution stage started
+154ms ║║Cancelling condition #13’s schedules…
+155ms ║║Condition #13 evaluated false (9ms)
+156ms ║║Cancelling condition #1’s schedules…
+157ms ║║Condition group #1 evaluated false (state changed) (12ms)
+409ms ║║Executed physical command [TV].chdown() (235ms)
+411ms ║║Executed [TV].chdown (237ms)
+473ms ║║Executed physical command [TV].custom2() (59ms)
+474ms ║║Executed [TV].custom2 (61ms)
+477ms ║║Cancelling statement #14’s schedules…
+499ms ║║Executed physical command [TV Plex].off() (18ms)
+500ms ║║Executed [TV Plex].off (20ms)
+516ms ║║Calculating (string) 21:25 + (string) on >> (string) 21:25 on
+521ms ║║Calculating (string) 21:25 on + (string) >> (string) 21:25 on
+526ms ║║Calculating (string) 21:25 on + (string) 16 >> (string) 21:25 on 16
+530ms ║║Calculating (string) 21:25 on 16 + (string) of >> (string) 21:25 on 16 of
+534ms ║║Calculating (string) 21:25 on 16 of + (string) >> (string) 21:25 on 16 of
+538ms ║║Calculating (string) 21:25 on 16 of + (string) February >> (string) 21:25 on 16 of February
+544ms ║║Calculating (string) since + (string) >> (string) since
+548ms ║║Calculating (string) since + (string) 21:25 on 16 of February >> (string) since 21:25 on 16 of February
+551ms ║║Calculating (string) since 21:25 on 16 of February + (string) >> (string) since 21:25 on 16 of February
+555ms ║║Executed virtual command [TV Plex].setState (0ms)
+557ms ║╚Execution stage complete. (418ms)
+559ms ╚Event processed successfully (559ms)
16/02/2019, 21:24:34 +583ms
+1ms ╔Received event [TV Plex].switch = on with a delay of 783ms
+117ms ║RunTime Analysis CS > 17ms > PS > 60ms > PE > 39ms > CE
+119ms ║Runtime (39546 bytes) successfully initialized in 60ms (v0.3.109.20181207) (117ms)
+120ms ║╔Execution stage started
+131ms ║║Comparison (enum) on changes_to (string) on = true (1ms)
+132ms ║║Cancelling condition #13’s schedules…
+134ms ║║Condition #13 evaluated true (7ms)
+135ms ║║Cancelling condition #1’s schedules…
+136ms ║║Condition group #1 evaluated true (state changed) (10ms)
+138ms ║║Cancelling statement #2’s schedules…
+202ms ║║Executed physical command [TV].custom1() (60ms)
+204ms ║║Executed [TV].custom1 (62ms)
+208ms ║║Executed virtual command [TV].wait (0ms)
+209ms ║║Waiting for 3000ms
+3265ms ║║Executed physical command [TV].custom4() (52ms)
+3266ms ║║Executed [TV].custom4 (56ms)
+3270ms ║║Executed virtual command [TV].wait (1ms)
+3271ms ║║Waiting for 2000ms
+5331ms ║║Executed physical command [TV].custom2() (57ms)
+5332ms ║║Executed [TV].custom2 (59ms)
+5336ms ║║Executed virtual command [TV].wait (0ms)
+5338ms ║║Waiting for 3000ms
+8398ms ║║Executed physical command [TV].custom2() (58ms)
+8399ms ║║Executed [TV].custom2 (60ms)
+8403ms ║║Executed virtual command [TV].wait (1ms)
+8405ms ║║Requesting a wake up for Sat, Feb 16 2019 @ 9:24:44 PM CET (in 2.0s)
+8410ms ║╚Execution stage complete. (8290ms)
+8412ms ║Setting up scheduled job for Sat, Feb 16 2019 @ 9:24:44 PM CET (in 1.995s)
+8421ms ╚Event processed successfully (8420ms)


#6

I have hot food at the moment, so I will look at your log afterwards…

For clarity, most of the network congestion I was referring to is local congestion (inside your local network). For example, if I am doing heavy downloading or streaming, my SmartHome used to be slow to respond. I have since gone into my router settings (QoS, or Quality of Service) and set my ST hub to the number-one priority. That one trick made a huge difference in my network and SmartHome response times.

If there is a gamer in the house, that can also use a lot of bandwidth, and impact performance.

Just throwing ideas out there


#7

I don’t seem to be able to capture a screenshot with the trace visible, though.


#8

The green camera icon works with Trace on, and the resulting picture has more information displayed than before.


#9

Harmony is notorious for this. I try to avoid using it unless there is no other option, and even then, do not expect Harmony to be snappy.


#10

I hear you about Harmony, but I’m not using Harmony Activities; I’m using the Harmony API on a Raspberry Pi to tell Harmony what to zap and when to zap it. So the latency must be in the commands coming from webCoRE/the ST cloud into my local network to the Raspberry Pi…


#11

For some reason, I thought that the Harmony API still sent the final commands to the Harmony hub so the keystrokes could be sent across the room via IR/RF…

Either way… the delays could be in any one of the hops…

SmartThings > Raspberry > Harmony API > Zapier > (Harmony Hub?) > Device


#12

I’m already getting much better results with Parallelism turned off.
What does “Command optimization” do? Would that help here?
Next, I’m going to take another look at Quality of Service in the Advanced Tomato software. I wasn’t impressed with it before; it breaks traffic up into arbitrary categories, i.e. HTTPS, Media, WWW, Remote, VOIP/Game, Service, Unclassified. I prefer the approach you described: just giving the IP address of the ST hub priority over everything else would be neat.


#13

Excellent!!


It won’t do anything in this case. What it does applies to devices: it decides whether to push a command based on the state the device is already in.

For example, the command “Turn off lamp” will not actually be sent if the lamp is already off… but… if you disable command optimization, it will push the command regardless.
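
A minimal sketch of the idea (hypothetical names, just to show the logic):

    class Lamp:
        def __init__(self):
            self.switch = "off"
        def send(self, command):
            print(f"pushing '{command}' to the device")
            self.switch = command

    def turn_off(lamp, optimize=True):
        # with command optimization ON, skip sends that would change nothing
        if optimize and lamp.switch == "off":
            return    # already off: the command is optimized away
        lamp.send("off")

    lamp = Lamp()
    turn_off(lamp)                    # nothing is sent -- the lamp is already off
    turn_off(lamp, optimize=False)    # "off" is pushed regardless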


Tomato definitely has that feature. I give the very highest priority to my ST hub based on its MAC address, but if your IPs are static, you could go that route as well.


#14

Testing this continuously, I’ve come to a conclusion, and I know it’s heresy to say on this forum, but the inconsistency comes from webCoRE. This whole setup runs locally apart from webCoRE: the Harmony hub has been blocked from the internet, so it’s not that, and the thing that decides which IR commands to send, and when, is webCoRE. With QoS on or off, with no other internet activity, and even with another instance of webCoRE created just for this piston, I see the same behaviour.

My first partial solution was to limit my wait periods to under 3 seconds, using 2900 milliseconds as a maximum, and that worked 90% of the time. Something happens at 3 seconds with webCoRE that makes it not count correctly. The problem was that if the Samsung TV was an extra half-second slow to open the Smart Hub page, this wouldn’t work, and I had some longer waits opening other programs on the TV, so this clearly wasn’t going to work either.

So I dug out my Broadlink Pro, programmed a scene/macro, and it just worked, every time, regardless of the wait period. I then downloaded RM Bridge onto a tablet, copied my Broadlink commands over, and re-created some scenes/macros. Now my piston just makes a GET request, and the IR commands come through in order, at the correct time, every time.
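
The piston side now reduces to a single HTTP call. A minimal sketch of the pattern (the tablet’s address, port, and code name are placeholders; the exact URL format depends on your RM Bridge setup):

    import requests

    BRIDGE = "http://192.168.1.60:7474"   # hypothetical tablet running RM Bridge

    def fire_macro(name):
        # one GET replaces the whole keystroke/wait sequence; the bridge
        # replays the learned Broadlink scene with its own local timing
        requests.get(f"{BRIDGE}/code/{name}", timeout=5)

    fire_macro("open_plex")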


#15

Oh yes, webCoRE’s timing (waits) is not atomic-clock precise.
(There are slight hiccups at wake-ups, especially around every ten seconds.)

It is always best to round up your wait times for better reliability.

Another way to think of this:
Does a web page load at precisely the same speed every time?


SmartThings stores all the programming logic in the cloud…
Not sure about the Harmony API, but that is probably online as well
I think the same goes for Zapier…

This means each command passes through your modem/router four times!

Hub > Modem > SmartThings > Modem > Raspberry > Modem > Harmony API? > Zapier > Modem > Hub > Device

Again, with all that crazy hopping around, it is best to round up your wait times for better reliability (or find a more direct route).


This sounds much smarter than using Harmony!


#16

And I’ve gone from a piston that used 53% memory to one that uses 3%!