Pistons with waits failing [TCP piston state change]


#1

I have a bunch of pistons with waits that are not executing. I’m seeing red ovals in the trace gutter with times incrementing instead of decrementing. Anyone else having this issue?


On further review, it seems like those with short waits (3 seconds or so) are fine, whereas ones that request a wake-up further in the future (45, 60, or 90 seconds, etc.) do not execute.


Time triggers don't appear to work
#2

@bthrock also reported the same issue :slight_smile:


#3

I posted the issue in that thread, will follow updates there instead of here.


#4

Please continue in the topic that you started here; announcement threads are not an appropriate place to debug issues that are not related to the release. Can you post a snapshot of your piston or of a simplified piston that shows the same behavior?


'Do Every' piston running every 3 seconds
#5

Today I noticed a significant change in SmartThings scheduling behavior related to a different issue:

For a few months now any events scheduled over 25 days in the future would reinvoke the smart app immediately rather than at the specified time. This was 100% repeatable behavior and the piston would continue looping for hours to days before the schedule was handled properly. A few of us tested yesterday and it was still occurring, but as of today the scheduling seems to finally be working correctly.

All of that to say, it seems that SmartThings may have made some changes to scheduling yesterday. @destructure00 is still on the old version of webCoRE according to the logs posted on the announcement forum, so this can be confirmed as a ST platform issue rather than an issue with yesterday’s webCoRE update, which did not include any changes to scheduling code.


#6

I noted yesterday that I thought this might be a ST issue, and the reason for that suspicion was that a given piston did not always fail in quite the same way twice. However, the timing relative to the update was oddly coincidental and there were (and as yet, are) no issues posted on status.smartthings.com.

All that notwithstanding, this is a very serious issue that is wreaking havoc with a number of critical pistons, causing them to delay, fail, or randomly refire. It is, for me at least, easily reproducible, though as I mentioned the behavior can vary somewhat from one execution to the next. Below is an extraordinarily simple piston, fired once by clicking the ‘Test’ button, and you can see the results in the subsequent log.

piston


#7

Last night I did not show any updates available in IDE. This morning I did, updated all 4 apps, hard reloaded dashboard, edited and resaved piston. Still having the same problem. Logs show a wake-up requested, but Quick Facts show no next job scheduled.

image

The wait never executes; the time goes positive and turns red in the trace gutter.

Here’s the piston. I have a couple other like this that are exhibiting the same behavior. This all started yesterday afternoon, everything was working fine before then.


#8

The recovery here suggests that something in the piston execution is timing out. Coupled with the occasional 10-15 second delay when scheduling does work, I wonder if it’s the scheduling that times out. I have seen the 10-15 second delay today as well with the simple wait piston.

@destructure00 do you get anything in the logs triggered by a recovery a few minutes later or anything in the Live Logging tab at account.smartthings.com for this piston?


#9

Normally I’d assume it was another clumsy mistake in my code, but even I couldn’t muck up that example! Just for the heck of it, knowing there was no chance it would do any good, I did reboot my hub and reset my webCoRE cache. Yeah, I know … :thinking:

Any suggestions on how to get someone’s attention at ST on this? (Hey @ady624 , where are you? :grinning:) I’m having to pause several key pistons that have now become wholly unreliable.


#10

Live logging in IDE matches the piston log, nothing extra there. I also do not see any recovery happening in the piston logs.


#11

So one of my pistons with waits is working this morning. I do see a difference in the logs between that one and the one that’s not working. The one that’s working has a trace log that requests a wake-up and an info log that sets up a scheduled job:

This one does not work. It has the same trace log requesting a wake-up, but it’s missing the info log setting up a scheduled job:


#12

We have a few @ST_Staff members here who may get the notification about this change affecting scheduling and take action, but otherwise community.smartthings.com is the best place to reach out, since others outside of webCoRE are likely experiencing similar problems. The only possibly related topic I see so far is “Platform timeouts a lot now” from 13 hours ago. I would start by replying on that topic with as much detail as you can muster. ST would need to know which shard you are on (the URL you get redirected to after signing in at account.smartthings.com).

@ST_Community_Devs does anyone have time to whip up a smart app to test this, so that an example can be provided that does not rely on webCoRE? If this is a scheduling problem, it should boil down to inconsistent behavior of runIn() with short timeframes. That is separate from the earlier trouble with schedules > 25 days in the future, which is confirmed to be resolved as of today.
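
For anyone picking this up, here is a rough sketch of the measurement logic such a test would need, written in plain Python so it can be tried off-platform. It is not a SmartApp: `threading.Timer` only stands in for runIn(), and the names `measure_delays`, `schedule`, `tolerance`, and `grace` are all made up for illustration. It requests one callback per delay, records when each one actually fires, and labels it on time, late, or missed.

```python
import threading
import time


def measure_delays(delays, schedule=None, tolerance=1.0, grace=2.0):
    """Request one callback per delay (seconds) and report what happened.

    Assumes the delays are distinct and non-empty. `schedule(delay, callback)`
    is a hypothetical stand-in for the platform scheduler; by default it uses
    threading.Timer.
    """
    fired = {}                      # delay -> seconds elapsed when it fired
    lock = threading.Lock()
    start = time.monotonic()

    def make_callback(delay):
        def callback():
            with lock:
                fired[delay] = time.monotonic() - start
        return callback

    if schedule is None:
        def schedule(delay, callback):
            timer = threading.Timer(delay, callback)
            timer.daemon = True     # do not keep the process alive
            timer.start()

    for delay in delays:
        schedule(delay, make_callback(delay))

    # Wait past the longest delay so slow callbacks can still be counted.
    time.sleep(max(delays) + grace)

    report = {}
    with lock:
        for delay in delays:
            if delay not in fired:
                report[delay] = ("missed", None)
            else:
                drift = fired[delay] - delay
                status = "on time" if drift <= tolerance else "late"
                report[delay] = (status, drift)
    return report


if __name__ == "__main__":
    # Short delays keep the demo quick; the thread used 3, 45, 60, 90 s.
    for delay, (status, drift) in measure_delays([1, 2, 3]).items():
        print(delay, status, drift)
```

With the delays reported in this topic (3, 45, 60, and 90 seconds) a run takes a bit over 90 seconds. A real test would do the same bookkeeping inside a smart app, with runIn() doing the scheduling and a larger `grace` to catch the 10-15 second delays mentioned above.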


#13

Just for reference here, I am on the https://graph-na02-useast1.api.smartthings.com/ shard.


#14

Thanks, I am on https://graph.api.smartthings.com/


#15

https://graph-na02-useast1.api.smartthings.com here


Piston using a 5 second wait is looping?
#16

The latest reply in the SmartThings community topic suggests that it is definitely a related issue. For anyone who did not see the earlier link, we are also discussing this over here:


Piston stops prematurely
#17

I am having some very similar problems with some of my pistons. I created similar rules in both Smart Things and in Smart Rules. Both seemed to execute close to the expected times. My problems also started immediately after I updated to the latest revision. The pistons had been working fine up until that point.


#18

Yesterday’s updates over at ST killed off CoRE as well until we get a few code tweaks out. Seems like a Java version update has taken out a number of smart apps. Possibly a contributing factor in these scheduling issues.


#19

“We upgraded major versions of Groovy yesterday for executing SmartApps and DeviceTypeHandlers, which included some rather major changes to the language.”

Seems like their changes have affected quite a few SmartApps :frowning: … would’ve been nice to alert developers :angry:


#20

The ‘wait’ issue seems to have been resolved as of about an hour ago. Will do more confirming tests in the AM.