Author Topic: Watch dog timer.  (Read 22202 times)

garysdickinson

  • Hero Member
  • Posts: 502
  • Old PLC Coder
    • View Profile
Watch dog timer.
« on: November 21, 2018, 06:45:08 PM »
I have a client who is concerned that his PLC-based system is becoming non-responsive.  He has 25 of these systems running in at least 3 different countries. These systems have been running for 3+ years.

I am looking at the SetSystem 255 command.  This is described as an "on-CPU Watch Dog Timer".  I fully understand the concept of this mechanism.

However, I am puzzled by the documentation.  The initial description is, "If enabled and the CPU goes astray because of noise-induced trouble, the WDT will reset the CPU when it times out."  This is exactly what I would expect for a WDT.

Here is where I get into trouble.  The documentation talks about "when the programed is stucked [sic] inside a GOTO loop".  Then the documentation states that "the WDT will not kick in if your program is stucked [sic] inside a FOR..NEXT or WHILE..ENDWHILE loop."  So you are telling me that the WDT only works in a GOTO loop but not FOR..NEXT or WHILE..ENDWHILE?  This makes no sense.

Now for the big issue: the documentation states, "WDT also will not activate if your program encounters undefined interrupt error..."

I believe that there is a SetSystem command that allows me to set up a trap for undefined interrupts.  I currently can't find where this is documented, but I believe that I can create a CF that could reset the PLC in the event of these interrupts.

Can I use both the WDT and the undefined interrupt trap at the same time?

Can you point me to the documentation for the interrupt trap?

Best regards,

Gary D*ckinson


support

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3174
    • View Profile
    • Internet Programmable PLCs
Re:Watch dog timer.
« Reply #1 on: November 26, 2018, 10:02:01 PM »
Sorry for the late reply due to all the holiday season lethargy...

The firmware automatically resets (kicks) the WDT while the program is inside a FOR..NEXT loop. Some FOR..NEXT loops may take longer than the WDT time-out period to execute, and we didn't want the processor to be reset by the WDT when the program is performing a legitimate FOR..NEXT loop that takes a long time to complete.

We believe more applications would be caught by an unexpected WDT reset if we didn't automatically kick the WDT inside a FOR..NEXT loop.  It is more likely for spaghetti GOTO statements inside a function to go astray than a structured FOR..NEXT or WHILE..ENDWHILE loop, and hence we do not let the WDT time out inside a structured loop. We are not purists in this sense; we are just trying to minimize the problems that users may encounter if we let the WDT expire inside a structured loop and depend on the programmer to reset the WDT in a timely manner.

You can use INTRDEF 100, n, 1 to set an undefined interrupt trap, where n is the custom function number. This command is described in the help file for INTRDEF.

Yes, you can use both the WDT and the undefined interrupt trap at the same time. That allows you to catch the undefined interrupt and decide whether it is safe to let execution resume or to force a processor reboot.
« Last Edit: November 26, 2018, 10:04:25 PM by support »
Email: support@triplc.com
Tel: 1-877-TRI-PLCS

garysdickinson

Re:Watch dog timer.
« Reply #2 on: November 27, 2018, 05:48:31 PM »
Not a worry on holidays. I just came out of a gravy-induced coma.

I am attempting to test out the Watch Dog timer mechanism on a Nano-10 running R82A firmware.

I have a simple .pc6 program that can detect when the PLC restarts.
I have a single CF that I can trigger from on-line monitoring.  I think that this CF should trigger the WDT mechanism in a bit less than 3 seconds.  But it does not.  The CF runs to completion and the PLC does not reboot.

Can you give me a hint as to what I am misunderstanding?

Code: [Select]
' TimeWaster this code is supposed to start the WDT and then force the
'   WDT to reset the PLC before the CF completes
'
setsystem 255,&hb3      ' EnableWDT 2.67s timeout with Nano-10 R82A firmware

setsystem 255,&h00      ' reset WDT

delay 1000            ' kill 1st second
delay 1000            ' kill 2nd second
delay 1000            ' kill 3rd <-- WDT should reboot PLC about here
delay 1000            ' kill 4th
delay 1000            ' kill 5th
delay 1000            ' kill 6th
delay 1000            ' kill 7th
delay 1000            ' kill 8th
delay 1000            ' kill 9th
delay 1000            ' kill 10th

Best regards,

Gary D*ckinson

support

Re:Watch dog timer.
« Reply #3 on: November 27, 2018, 09:50:55 PM »
Thanks for reporting the test.

You may have guessed it already: the TBASIC firmware resets the WDT when you run the DELAY function, so multiple DELAY functions executed one after another are not going to trigger a WDT reset.

Our thinking is that if you put in a DELAY 1000 or even a DELAY 10000 you probably know what you are doing, and we don't want to jump the gun by forcing a WDT reset on such actions.

Basically, we intend for the WDT reset to happen only when something really unexpected takes place. We did not intend to force programmers to remember to reset the WDT everywhere if they choose to enable it.

We think many programmers develop their application programs without the WDT initially and, come production time, decide to add the WDT. If the firmware didn't automatically reset the WDT in most programming constructs, the programmer would have to check the program carefully and add a WDT reset everywhere the program may spend too long in a loop. Even in the best case there is still a good chance that the programmer may miss a function or two during the inspection, and the machine then gets shipped to the customer. Their customers will then report that the controller seems to reboot itself every now and then for no good reason, and we are probably going to get an earful from irritated programmers...
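The flip side of this design is that a legitimately long-running GOTO loop has to reset the WDT itself. A minimal sketch, assuming that SETSYSTEM 255, 0 resets the WDT as it is used in the test CF earlier in this thread; the label numbers, the loop count, and the placeholder loop body are hypothetical:

```
' Hypothetical long-running GOTO loop that kicks the WDT once per pass
i = 1000                  ' assumed iteration count
@200
if (i = 0)
   goto @210
endif
   ' ... work that the firmware does not reset the WDT for ...
   setsystem 255,&h00     ' manual WDT reset (same call as in the test CF above)
   i = i - 1
goto @200

@210
```

Each pass must finish well inside the WDT time-out period for this to be effective.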

We hope the above explanation makes sense. Thanks for the feedback.



« Last Edit: November 28, 2018, 12:32:31 AM by support »

garysdickinson

Re:Watch dog timer.
« Reply #4 on: November 28, 2018, 06:14:03 PM »
Thanks for the info on the delay statement.  

So far I have found that the WDT is reset during the execution of:

  for/next
  while/endwhile
  delay statements

I can get the WDT to reset the PLC during the execution of the following CF:

Code: [Select]
' TimeWaster this code is supposed to start the WDT and then force the
'   WDT to reset the PLC before the CF completes
'

setsystem 255,&hb3      ' EnableWDT 2.67s timeout with Nano-10 R82A firmware

setsystem 255,&h00      ' reset WDT


' OK, try to trip the WDT using a while/endwhile-style loop built from goto statements
'
i = 300                     ' 300 loops of 10 ms is about 3 seconds
@100
if (i = 0)                  ' while(i)
   goto @110
endif   
   RSHIFT DM[32],256         '    this kills 2.0 ms on a Nano-10
   RSHIFT DM[32],256         '    this kills 2.0 ms on a Nano-10
   RSHIFT DM[32],256         '    this kills 2.0 ms on a Nano-10
   RSHIFT DM[32],256         '    this kills 2.0 ms on a Nano-10
   RSHIFT DM[32],256         '    this kills 2.0 ms on a Nano-10
   i = i - 1
goto @100                  ' endwhile

@110
a = 100                     ' loop exit marker


I have also found that the WDT mechanism is only active during the execution of CFs. If I enable the WDT and no CFs are invoked from the ladder logic, the WDT does not reset the PLC.

The documentation describes the Watch dog Timer with "If enabled and the CPU goes astray because of noise-induced trouble, the WDT will reset the CPU when it times out. Note that to simplify user's application the TBASIC O/S automatically resets the WDT during its normal execution except when the program is stucked".

Does this mean that if the CPU goes astray and the ladder logic scanner quits running, the WDT cannot reset the PLC?

I am trying to figure out a mechanism to reboot the PLC when it has gone astray.  The WDT doesn't do this. It seems to be enabled during CF execution and not ladder logic execution.

Additionally, if the WDT does time out in a CF, it breaks the connection with on-line monitoring.  I find that I have to disconnect from the PLC and reconnect to get on-line monitoring working again.

Sorry to be such a pain.  But I am guessing that no one has ever attempted to use the WDT mechanism.

Best regards,

Gary D*ckinson

support

Re:Watch dog timer.
« Reply #5 on: November 29, 2018, 07:03:46 PM »
If you have enabled the WDT but there are no custom functions at all in the program, there is no place in your code to run the WDT reset. So, again, the firmware automatically resets the WDT at the end of each ladder logic scan, since this is considered normal program execution. If the CPU crashes at some unexpected location, it is not going to execute the ladder program to the end, so the ladder logic scanner will not reset the WDT and a WDT time-out will occur.

Are you connected to the PLC via Ethernet? A reboot caused by a WDT time-out will leave the Ethernet connection "half-open", and i-TRiLOGI does not know that the connection is lost until after some repeated time-outs. That is why you need to disconnect from the PLC in order to connect again. If i-TRiLOGI is monitoring the PLC via a serial port, you should not need to disconnect and reconnect.




garysdickinson

Re:Watch dog timer.
« Reply #6 on: December 04, 2018, 02:27:34 PM »
Thanks for the feedback on the on-line-monitoring via Ethernet. It makes perfect sense.

I have found some of the string functions reset the WDT.  I suspect that strcmp(), mid$() and len() reset the WDT. I had been attempting to use these functions as time wasters.  They are great time wasters but they keep resetting the WDT and this makes them unsuitable for my purposes.

I have also found that the SetLCD statement appears to reset the WDT.  I am very familiar with the timing and interface to these displays (former life). I know that the screen clear command takes close to 15 ms to complete. But, alas, my code based on SetLCD statements failed to cause the WDT to time out.

Just to be fair, the LSHIFT and RSHIFT statements DO NOT appear to reset the WDT.  You can get a pretty reasonable time delay out of these statements if you have them span hundreds of registers.

RSHIFT DM32[1],106         '    this kills 1.0 ms on a Nano-10
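
The 1.0 ms RSHIFT above can be wrapped in a GOTO loop to build a time waster that runs past the WDT time-out. This is only a sketch: the target of 3000 passes and the label numbers are assumptions, and the loop is built from GOTO rather than FOR..NEXT because the firmware resets the WDT inside structured loops.

```
' Hypothetical time waster: roughly n ms without resetting the WDT
n = 3000                      ' assumed target, above the 2.67 s timeout
@120
if (n = 0)
   goto @130
endif
   RSHIFT DM32[1],106         ' about 1.0 ms per call on a Nano-10 (figure from above)
   n = n - 1
goto @120

@130
```

Note that RSHIFT really does shift the contents of DM32[1] through DM32[106], so this should only be used where that data is disposable. Loop overhead adds to the total time.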

Best regards,

Gary D*ckinson
« Last Edit: December 04, 2018, 02:30:21 PM by garysdickinson »

garysdickinson

Re:Watch dog timer.
« Reply #7 on: December 05, 2018, 03:56:19 PM »
Thanks for reminding me about the INTRDEF 100,n,1 stuff.

It is not documented anywhere that I have found.  

The help info for INTRDEF from within the CF editor looks like this:

Enable Interrupt Input channel ch.
ch = interrupt channel number (pls refer to PLC installation guide)
fn_num= Custom Function number to execute when interrupt pin changes according to the defined edge. This is the Interrupt Service Routine ISR.
edge = Positive number means rising edge-triggered. 0 or negative number means falling-edge triggered.


No mention of Undefined interrupt trap (ch# 100).

The "help" mentions the "PLC installation guide".  However your website's product documentation section doesn't have any installation guides.

There are user manuals on the website. I have gone through each of them and there is no mention of INTRDEF with ch# 100.

The TL6 Reference Manual does give some generic info on INTRDEF, but suggested that the Ch argument is from 1..8 (physical inputs) only.

The TL7 addendum makes no mention of INTRDEF.

Is there any change to the TBASIC/PLC firmware that would allow a user-defined interrupt CF to determine what interrupt caused the exception?  Or the CF number that was executing when the exception occurred?

Gary D

support

Re:Watch dog timer.
« Reply #8 on: December 07, 2018, 07:46:25 AM »
You are correct that what the help text calls the "Installation Guide" should be called the "User Manual".  I just checked: for the FMD88-10 and FMD1616-10 you can find INTRDEF 100, n mentioned on pages 1-14 and 9-2.  There is a syntax error there: it should have been "INTRDEF 100, n, 1", otherwise it wouldn't compile. But you are correct that this is not mentioned in the Fx2424 and Fx1616 User Manuals, and we will get this fixed.



garysdickinson

Re:Watch dog timer.
« Reply #9 on: December 07, 2018, 04:01:49 PM »
Thanks for finding the documentation on the User-Defined Run-Time Error Trap.  I couldn't find it in my panic.

I have updated my test program for both the watch dog timer and the user-defined run-time error trap to include the TRI documentation for these mechanisms.

I have some suggestions on how to make these two mechanisms actually useful for PLC applications:
  • Make the error messages written to the PLC "visible" to custom functions.
  • If the user-defined run-time error trap has been set up, change the behavior of the current WDT to call the same trap handler that is used for the run-time errors.  If you do this, it would be great if the WDT timeout mechanism updated the LCD with some sort of error message indicating that the problem is a WDT timeout.

Why?  This way the error handler can log the exception that terminated the PLC execution to either EEPROM or the filesystem. This allows debug of PLC systems that are running in the field and not in my lab.  I am looking for ways to monitor 25 systems that are running at remote sites in 3 different countries.

To make the error messages available to the error trap, provide some "special" access to the text array that your low-level firmware manages for the LCD display. I know that it is buffered on the PLC because it can be accessed via host link commands even if no LCD is physically connected to the PLC.

Maybe one of these access methods could be provided:
  • Special-case LOAD_EEP$(n) to return the LCD buffer lines when n has values such as -1,-2,-3,-4.  I think you have already special-cased LOAD_EEP$(0) to return some string about the PLC firmware version.
  • Special-case the $$[n] string mechanism to return the LCD buffer info for some value of n that is outside of the range 1..26.  So $$[-1] might reference the first line of text in the LCD buffer.
  • Modify the run-time error trap code that writes the error message to the LCD to also copy the error message to Y$..Z$ before executing the user-defined run-time error trap CF.  These variables are "toast" as the CF can't do much other than reboot the PLC.
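
As a rough illustration of the logging idea, here is a sketch of an init CF plus a trap handler that records the failure in EEPROM and then lets the WDT reboot the PLC. The CF number 20, EEPROM address 1, and label number are hypothetical. It assumes SAVE_EEP data, addr and LOAD_EEP(addr) behave as described in the TBASIC reference, and that the WDT remains armed while the trap CF runs, which has not been confirmed in this thread.

```
' --- Init CF (hypothetical), run once at power-up ---
intrdef 100,20,1                ' route run-time errors to CF #20 (number is an assumption)
setsystem 255,&hb3              ' enable the WDT, same value as the test CF in this thread

' --- CF #20 (hypothetical): user-defined run-time error trap ---
save_eep load_eep(1) + 1, 1     ' bump a crash counter at EEPROM address 1 (assumed address)
@250
goto @250                       ' spin in a GOTO loop so the WDT can time out and reboot
```

The counter starts from whatever EEPROM address 1 already holds, so it would need to be zeroed once before deployment. EEPROM write endurance is finite, which is another reason to log only on a trap and not on every scan.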

Best regards,

Gary D*ckinson
« Last Edit: December 07, 2018, 05:46:51 PM by garysdickinson »

garysdickinson

Re:Watch dog timer test Code
« Reply #10 on: December 07, 2018, 04:05:04 PM »
Please find attached a PLC program that provides documentation on both the watch dog timer and the user-defined run-time error trap mechanisms.

Gary D

support

Re:Watch dog timer.
« Reply #11 on: December 11, 2018, 08:03:53 PM »
Thanks for the valuable input and we certainly would consider experimenting with your suggestions to see if we could incorporate them into future firmware releases. The FxCPU firmware is pretty stable already and we haven't made any major changes in the past 2-3 years. We shall weigh the merit of the suggested changes (how often they will be used) against the additional coding costs (memory space, developing and testing the additional code and mostly making sure that the additional code will not introduce new problems to the existing system).  We will keep you posted.