EPC timers no longer working after long SMART test #126
We have an existing thread on a similar issue. Please have a look at it; it might help.
Thanks for your response. It seems odd that it magically starts working again after a certain number of hours. However, I'm not really able to let them run full tilt all the time until it fixes itself, due to temperature and power constraints.
Hi @luukrijnbende, I have been asking around about this to get an idea of what is happening. There is not enough information to know for sure, but the best guess is that the SMART self-test (long DST) paused some background activity that normally runs on a timer, so since the self-test finished the drive has been trying to catch up and complete that background work whenever it can. There are many kinds of background tasks in the drive: some run periodically in allowed power states (active, idle_a) and others are scheduled as needed when the drive is used (reads, writes, etc.). Background activity can be scheduled for health monitoring, performance, reliability, or data-integrity reasons, so it is also entirely possible that something else triggered it that is not related to the SMART self-test at all.
@luukrijnbende I also hit this, after copying a few TB of data to some new Seagate IronWolf drives. Did the issue eventually fix itself for you? (see my comment here)
I have a similar situation: I ran an extended self-test that took 12h to complete, and the disks are no longer going to idle.
Sorry I did not see your comment sooner. And if anyone else has seen this issue, has it occurred after a short DST? Or only after a long DST?
They're still not going to sleep on their own, but since I posted they have spent 12h a day (overnight) in standby_z, because I transition them manually. For the first 3 days I left them in active mode, hoping they would finish whatever they were busy with, but since then I have used a script to put them into standby_z for the night. (And they do stay in that mode, so the host is definitely not issuing any stray commands.)
I did reboot for sure. I think I did a full power off but I'm not 100% sure anymore. I will retry that when I get the chance.
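For anyone wanting to try the same workaround, the nightly manual transition described above could look roughly like the sketch below. It is not the script actually used in this thread. It assumes `openSeaChest_PowerControl` is installed and on `PATH` and that its `--transitionPower` option accepts `standby_z`; the window hours and the device list are hypothetical placeholders.

```python
import datetime
import subprocess

# Assumption: openSeaChest_PowerControl is installed and on PATH.
TOOL = "openSeaChest_PowerControl"


def in_night_window(now, start_hour=20, end_hour=8):
    """Return True if `now` is inside the window; the window may wrap past midnight.

    The default 20:00-08:00 window is a hypothetical 12h overnight period.
    """
    if start_hour <= end_hour:
        return start_hour <= now.hour < end_hour
    return now.hour >= start_hour or now.hour < end_hour


def build_transition_command(device, state="standby_z"):
    """Build the argument list for a manual power-state transition."""
    return [TOOL, "-d", device, "--transitionPower", state]


def put_drives_to_standby(devices, now=None):
    """Send each drive to standby_z, but only during the night window."""
    now = now or datetime.datetime.now()
    if not in_night_window(now):
        return
    for dev in devices:
        # check=False: a failure on one drive should not stop the others.
        subprocess.run(build_transition_command(dev), check=False)
```

Calling `put_drives_to_standby(["/dev/sda", "/dev/sdb"])` (hypothetical device names) from a cron job or systemd timer, as root, would repeat the transition if a drive has been woken up during the night.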
I shut down the machine. I even unplugged it from the wall and let it sit for a bit.
@smunaut, would you mind telling me more about the system hardware? Anything you can share would be helpful for us to try and figure out what is causing this.
I know that not all of these things can be shared; however, any additional system information that you can share could help us see whether this is something we can try to reproduce.
It's a small NAS exposing volumes through NFS and Samba.
Some other info:
Thank you for that info! The original issue also listed AMD hardware. I will see if I can repeat this on any AMD hardware...maybe it's related, maybe it's not. Would you mind sharing the output of |
B560 is an Intel chipset 😁 (here running an i3-10105). Yeah, B550 is AMD and B560 is Intel. Confusing, I know...
🤦 ...I have an MSI motherboard at home with an almost identical name: MSI MPG B550 Gaming Edge Wifi. Anyway, thanks for the additional info. So it does not seem to be a hardware-specific issue, and it looks like the standard AHCI driver is in use, so nothing driver-specific that I have heard of before.
I have noticed that the EPC timers have started working again. When this first started happening I wrote a little script to monitor ZFS activity and put the drives into idle states if there is none. After a reboot that script didn't come up again, but EPC did work. I have no idea what the trigger was; maybe they finished their background tasks? Though I did notice today that they were active for a few hours without activity, and just now they returned to idle_c and are happy there.
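A monitoring workaround of the kind mentioned above could be sketched as follows. This is not the script from this thread: instead of watching ZFS, it assumes Linux and compares the completed read/write counters in `/proc/diskstats` between two polls, and it assumes `openSeaChest_PowerControl` with `--transitionPower idle_c` is available. The polling interval and device names are hypothetical.

```python
import subprocess
import time

# Assumption: openSeaChest_PowerControl is installed and on PATH.
TOOL = "openSeaChest_PowerControl"


def parse_diskstats(text):
    """Map device name -> (reads_completed, writes_completed) from /proc/diskstats text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip malformed or truncated lines
        # fields: major, minor, name, reads completed, ..., writes completed (index 7)
        counters[fields[2]] = (int(fields[3]), int(fields[7]))
    return counters


def idle_devices(before, after, devices):
    """Return the devices whose read/write counters did not change between polls."""
    return [d for d in devices
            if d in before and d in after and before[d] == after[d]]


def watch(devices, interval_s=600, state="idle_c"):
    """Poll forever and push drives with no I/O into `state` (needs root)."""
    with open("/proc/diskstats") as f:
        before = parse_diskstats(f.read())
    while True:
        time.sleep(interval_s)
        with open("/proc/diskstats") as f:
            after = parse_diskstats(f.read())
        for dev in idle_devices(before, after, devices):
            subprocess.run([TOOL, "-d", f"/dev/{dev}", "--transitionPower", state],
                           check=False)
        before = after
```

`watch(["sda", "sdb"])` would approximate what the EPC timers normally do on their own, so it only makes sense as a stopgap while the timers are not firing.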
Hi, I have 4 Seagate Exos X18 18TB drives (ST18000NM000J-2TV103) where the EPC timers are no longer working after a long SMART test.
I am able to transition power states manually and until I access the drives again they stay in that state.
Things already tried:
SN04
Below is the info of one of the drives: