Every Day At...Running in the past?


#1

I have a piston that is supposed to run every few minutes (chargeCheckIntervalMinutes is 17):

When I edit it, it gets scheduled normally. However, after a day or two, it starts to be scheduled in the past! E.g. right now is it September 20th at 11:28am, but the most recent log says:



Note the second to last line - it scheduled it for Friday, September 18th - two days ago, plus 17 minutes :confused:

In my dashboard it is shown to be scheduled for almost two days in the future (1d 23:34:35), even though the code is supposed to run much more frequently.

What I would expect is it would run every day at 12:01am, plus every 17 minutes as calculated in the variable.


#2

I think we are going have to have see the full piston and a log with trace turned on. Check here for instructions.


#3

In addition to @Pantheon’s advice, it is important to keep in mind that changing
{chargeCheckIntervalMinutes}
while the piston is running, will NOT be used on the next run…
but it should catch up by the second run… (unless that run also changes the variable)

This means that “Every {var} mins” will always be using old data, so depending on how often you change it, it is quite possible that it may never be accurate.


For example, lets say the variable starts off at 30.

1:00 = Piston runs (variable is untouched)
1:30 = Piston runs (variable is increased to 40)
2:00 = Piston runs (10 minutes early)
2:40 = Piston runs (on time, but out of sync)

The same concept applies when reducing:

1:00 = Piston runs (variable is untouched)
1:30 = Piston runs (variable is decreased to 20)
2:00 = Piston runs (10 minutes late)
2:20 = Piston runs (on time, but out of sync)


Once you understand the flow above, the real nightmare begins if it changes on every execution.
(IE: Accuracy drops to 0%)

In this example, let’s start with a 10 min variable:

1:00 = Var was 10… Actual delay was 10… New var set to 15
1:10 = Var was 15… Actual delay was 10… New var set to 20
1:25 = Var was 20… Actual delay was 15… New var set to 25
1:45 = Var was 25… Actual delay was 20… New var set to 05
2:10 = Var was 05… Actual delay was 25… New var set to 15
2:15 = Var was 15… Actual delay was 05… New var set to X

Notice how every single trigger fired at the wrong time except for the very first execution.


#4

Below is the full piston.

“Dimmer 1” is a Tesla Powerwall battery
“Presence Sensor 1” is a Tesla car.

ChargeCheckIntervalMinutes is computed based on some constants. I could precompute it and just put an integer in, but since I change my car’s charging rate now and then I wanted to have it as a variable I could easily manipulate.

Here’s a run with trace turned on:
9/20/2020, 8:59:29 PM +71ms
+1ms ╔Received event [Orchard].time/recovery = 1600660769071 with a delay of 0ms
+10221ms ║Calculating (integer) 48 * (integer) 240 >> (integer) 11520
+10224ms ║Calculating (integer) 11520 / (integer) 1000 >> (integer) 11.52
+10228ms ║Calculating (integer) 11 * (integer) 100 >> (integer) 1100
+10230ms ║Calculating (integer) 1100 / (integer) 40 >> (integer) 27.5
+10236ms ║Calculating (integer) 60 * (integer) 10 >> (integer) 600
+10238ms ║Calculating (integer) 600 \ (integer) 27 >> (integer) 22
+10247ms ║Calculating (integer) 100 * (integer) 7 >> (integer) 700
+10249ms ║Calculating (integer) 700 * (integer) 22 >> (integer) 15400
+10252ms ║Calculating (integer) 15400 \ (integer) 60 >> (integer) 256
+10254ms ║Calculating (integer) 256 \ (integer) 40 >> (integer) 6
+10256ms ║Calculating (integer) 85 - (integer) 6 >> (integer) 79
+10261ms ║RunTime Analysis CS > 46ms > PS > 10133ms > PE > 81ms > CE
+10261ms ║Piston waited at a semaphore for 10128ms
+10264ms ║Runtime (47324 bytes) successfully initialized in 10133ms (v0.3.110.20191009) (10262ms)
+10265ms ║╔Execution stage started
+10265ms ║╚Execution stage complete. (1ms)
+10267ms ║Setting up scheduled job for Fri, Sep 18 2020 @ 11:55:20 AM PDT (in 1s), with 2 more jobs pending
+10278ms ╚Event processed successfully (10278ms)

Note that it is planning it’s next run 2 days, 9 hours and 5 minutes in the past, since it is 9:01pm Sunday night the 20th.


#5

Based on your description, it is likely harmless in that scenario then…
Just realize that if it changes programmatically, it will take two executions to “get it right”.

(I guess I could have just typed that… LOL)


#6

I just noticed that currently, at least six math formulas need to be run before the piston can even begin…

Hmmm…

Something tells me this may be a clue for the “time travel” errors you have been having.

My advice would be to keep all math 100% out of the “define” section.
(Leave them all (no value set) up top)

Then, inside the “execute” bliock… (somewhere near the top)… Set those variables using the “Set variable” command.

The numbers will still be easy to “tweak” on your end…
but this method has at least two major benefits.

(1) The answers to all math formulas will always be clearly seen up top.

(2) No sluggish starts, since it will be a simple “Every X min”.


To say this another way, now that you realize that it is always executing based on previous data, I think you will find your triggers much snappier if there is zero calculations done outside of the body of the piston.

With the define section (no value set), then
Every {var} min … (IE: a previously captured integer of 25)
becomes a hundred times smoother than:
Every 60 * 10 / ((40 * 240 / 1000) * 100 / 40) min

For clarification:
It is totally OK to do complex math here in webCoRE… I am really just trying to encourage you to remove it from the “define” section if you are using a trigger based on it.


Side Note:

The log you posted was a “recovery” event, so the end results are often not “normal”.
(also note the semaphore delays)


… and of course

Editing/Saving or Pausing/Resuming the piston should clean up any old schedules.
(but will not resolve semaphore issues)

pic

9/20/2020, 8:59:29 PM … Setting up scheduled job for Fri, Sep 18 2020 @ 11:55:20 AM


#7

I made it 22. So did your piston according to the logs.

The piston is still trying to schedule a run for 22 minutes after you last saved the piston, which should have been its first scheduled run.

It seems to be getting terribly confused by recovery events.


#8

Thanks for the feedback. I guess declaring constants in variables is confusing to webcore somehow, or perhaps it is recovery events?

I tried migrating the constants into variable assignments in the body but still saw time travel, so I’ve gone ahead and just removed them all and move hardcoded numbers into my code. Fingers crossed that it fixes things. I guess no more constants for me :confused:


#9

For the record, I have some insane math formulas in hundreds of my pistons.
(but they are all done inside the execute section of those pistons)

Once they run once, the variables populate up top with integers, and that is referenced for the next trigger.


#10

So I simplified it, removing nearly all the computations:

It is still time traveling - right now it is stuck at when I edited it this morning:

9/21/2020, 6:48:18 PM +92ms
+1ms ╔Received event [Orchard].time = 1600714381548 with a delay of 24916544ms
+7612ms ║RunTime Analysis CS > 17ms > PS > 7543ms > PE > 52ms > CE
+7613ms ║Piston waited at a semaphore for 7539ms
+7615ms ║Runtime (41855 bytes) successfully initialized in 7543ms (v0.3.110.20191009) (7614ms)
+7616ms ║╔Execution stage started
+7617ms ║╚Execution stage complete. (1ms)
+7619ms ║Setting up scheduled job for Mon, Sep 21 2020 @ 11:53:01 AM PDT (in 1s), with 2 more jobs pending
+7628ms ╚Event processed successfully (7628ms)
9/21/2020, 6:48:07 PM +46ms
+0ms ╔Received event [Orchard].time = 1600714381548 with a delay of 24905497ms
+8238ms ║RunTime Analysis CS > 17ms > PS > 8169ms > PE > 52ms > CE
+8239ms ║Piston waited at a semaphore for 8163ms
+8241ms ║Runtime (41855 bytes) successfully initialized in 8169ms (v0.3.110.20191009) (8240ms)
+8242ms ║╔Execution stage started
+8243ms ║╚Execution stage complete. (1ms)
+8245ms ║Setting up scheduled job for Mon, Sep 21 2020 @ 11:53:01 AM PDT (in 1s), with 2 more jobs pending
+8252ms ╚Event processed successfully (8253ms)

It looks like there is a semaphore problem where a run starts and then gets blocked, trying and retrying a semaphore forever, and that’s why it always is waiting for an event in the past, because a run from back then never completes?

I’ve turned on tracing again and restarted it, hoping that shows me where the semaphore is.


#11

Tracing didn’t yield much (any?) more info:

9/21/2020, 7:08:05 PM +407ms
+1ms ╔Received event [Orchard].execute = recovery with a delay of 86ms
+6894ms ║RunTime Analysis CS > 1062ms > PS > 5779ms > PE > 52ms > CE
+6895ms ║Piston waited at a semaphore for 5761ms
+6897ms ║Runtime (41849 bytes) successfully initialized in 5779ms (v0.3.110.20191009) (6895ms)
+6898ms ║╔Execution stage started
+6918ms ║╚Execution stage complete. (20ms)
+6920ms ║Setting up scheduled job for Mon, Sep 21 2020 @ 7:05:19 PM PDT (in 1s), with 2 more jobs pending
+6929ms ╚Event processed successfully (6929ms)
9/21/2020, 7:06:59 PM +77ms
+0ms ╔Received event [Orchard].time/recovery = 1600740419076 with a delay of 1ms
+9547ms ║RunTime Analysis CS > 2200ms > PS > 7292ms > PE > 54ms > CE
+9548ms ║Piston waited at a semaphore for 7263ms
+9550ms ║Runtime (41859 bytes) successfully initialized in 7292ms (v0.3.110.20191009) (9549ms)
+9551ms ║╔Execution stage started
+9552ms ║╚Execution stage complete. (1ms)
+9554ms ║Setting up scheduled job for Mon, Sep 21 2020 @ 7:05:19 PM PDT (in 1s), with 2 more jobs pending
+9563ms ╚Event processed successfully (9562ms)

Still stuck in the past. The semaphore and recovery events seem to be a problem as y’all mentioned. How do I get more info what those are about?


#12

Removing the clutter helps to pinpoint the issue…

Off topic note about piston “teto”…

Line 45 and 57 are expecting integers, but none are to be found.


Back on topic now… I think it would be helpful to fix the two lines mentioned above, save the piston, clear the old log, set logging to Full, and let it run for a few hours.

I would really like to see it from the get-go… Where it starts off working well, but then goes haywire.

Having that kind of data trail will be very helpful.


For large logs, you can select the text, and press here:

pic

… to keep your thread tidy.


#13

I’m pretty sure I’ve found the issue - it isn’t one with WebCore per se. One of the smartapps/devices I was using, Tesla Powerwall Manager, is no longer working per https://community.smartthings.com/t/release-tesla-powerwall-manager-smart-app-and-device-handler/170347/26

I’m guessing that the Tesla folks saw too much traffic coming from the SmartThings cloud and banned the IPs.

One thing WebCore could do a lot better in these situations is to at least indicate what smartapp or device the semaphore is coming from. That would tremendously help in debugging such things.

In any case, thank you all for your patient help in trying to figure this out!