Discussion:
POSKeyError for one device
FCharlier
2013-04-23 08:51:38 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72935#72935

--------------------------------------------------------------
Hi all,

For an unknown reason, it seems that we have hit logical corruption in the Zope database. When we try to access one specific device in the Zenoss web interface, no information about the device appears on the page and a yellow banner pops up with the following error message: "POSKeyError:0x049397".

After some research, I found the http://wiki.zenoss.org/Fixing_ZODB_Integrity_Issues page. I have already tried the three solutions without success.

Last output of the zenoss-repairdevice.py script:

Checking censored.hostname.com
ERROR:ZODB.Connection:Couldn't load state for 0x04939f
Traceback (most recent call last):
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
POSKeyError: 0x04939f
Traceback (most recent call last):
  File "./zenoss-repairdevice.py", line 93, in <module>
    processOneDevice(dev)
  File "/opt/zenoss/lib/python/ZODB/transact.py", line 44, in g
    r = f(*args, **kwargs)
  File "./zenoss-repairdevice.py", line 78, in processOneDevice
    dev.checkRelations(repair=True)
  File "/opt/zenoss/Products/ZenRelations/RelationshipManager.py", line 257, in checkRelations
    rel.checkRelation(repair)
  File "/opt/zenoss/Products/ZenRelations/ToManyRelationship.py", line 286, in checkRelation
    if len(self._objects):
  File "/opt/zenoss/lib/python2.7/UserList.py", line 30, in __len__
    def __len__(self): return len(self.data)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
ZODB.POSException.POSKeyError: 0x04939f

Any help appreciated :-)
--------------------------------------------------------------

Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/72935#72935]

Start a new discussion in zenoss-users at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]
Ryan Matte
2013-04-23 15:01:33 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72947#72947

--------------------------------------------------------------
Issues such as this can be caused by corruption in memcached.  Stop Zenoss, restart memcached (service memcached restart) then start Zenoss back up and try running the repair again.
--------------------------------------------------------------

FCharlier
2013-04-23 18:47:28 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72949#72949

--------------------------------------------------------------
I tried this solution, but the repair script stopped with exactly the same error :(
--------------------------------------------------------------

Ryan Matte
2013-04-23 18:52:18 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72939#72939

--------------------------------------------------------------
Apparently POSKeyErrors are very tricky to get rid of and could cause data loss if you're not careful doing it...

http://plonechix.blogspot.ca/2009/12/definitive-guide-to-poskeyerror.html
--------------------------------------------------------------

Ryan Matte
2013-04-23 18:53:01 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72956#72956

--------------------------------------------------------------
That link I pasted references Data.fs, but since Zenoss now uses relstorage to store the zope db in MySQL I'm not sure if you can use the scripts that they reference on that page or if you need to do it a different way.
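
With relstorage the object records live in SQL tables rather than in a Data.fs file, so one way to look for a missing object is to turn the oid from the traceback into a number and search for it directly. A minimal sketch, assuming the relstorage schema has an object_state table keyed by an integer zoid column (check your own schema first; nothing in this thread confirms it). The function names are made up for illustration:

```python
import struct

def oid_to_int(hex_oid):
    """Turn an oid as printed in the error (e.g. '0x04939f') into an integer."""
    return int(hex_oid, 16)

def oid_to_bytes(hex_oid):
    """Pack the oid into the 8-byte big-endian string ZODB uses internally."""
    return struct.pack(">Q", oid_to_int(hex_oid))

zoid = oid_to_int("0x04939f")
# Assumed table/column names (object_state, zoid); verify before running.
query = "SELECT zoid, tid FROM object_state WHERE zoid = %d;" % zoid
print(query)
```

If the query returns no row, the storage really has no record for that oid, which matches what the POSKeyError is reporting. This only inspects the database; it does not repair anything.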
--------------------------------------------------------------

FCharlier
2013-04-24 14:24:59 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72958#72958

--------------------------------------------------------------
Indeed, Zenoss seems to use relstorage, so I didn't find any Data.fs file. As history is not important for this device, I tried to delete it but I hit the same POSKeyError ...
--------------------------------------------------------------

FCharlier
2013-04-25 15:23:49 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72986#72986

--------------------------------------------------------------
Does anybody have a method to force-delete a corrupted device?
--------------------------------------------------------------

Shane Scott
2013-04-25 18:55:35 UTC
Permalink
Shane Scott [http://community.zenoss.org/people/hackman238] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72988#72988

--------------------------------------------------------------
FCharlier:

Yikes. For the heck of it, try d=dmd.Devices.findDevice('problemDevice') then d.deleteDevice()

Let me know what traceback it produces.

--Shane Scott (Hackman238)
--------------------------------------------------------------

FCharlier
2013-04-26 07:21:50 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/72989#72989

--------------------------------------------------------------
Here you are:

[***@zenoss ~]$ zendmd
Welcome to the Zenoss dmd command shell!
'dmd' is bound to the DataRoot. 'zhelp()' to get a list of commands.
Use TAB-TAB to see a list of zendmd related commands.
Tab completion also works for objects -- hit tab after an object name and '.'
(eg dmd. + tab-key).
>>> d=find('censored.hostname.com')
>>> d.deleteDevice()
2013-04-26 09:14:49 ERROR zen.Relations Remote remove failed. Run "zenchkrels -r -x1". /opt/zenoss/log/tracebacks/tmp5BmWVX.txt contains the description of this error.
2013-04-26 09:14:49 ERROR ZODB.Connection Couldn't load state for 0x049398
Traceback (most recent call last):
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
POSKeyError: 0x049398
2013-04-26 09:14:49 ERROR ZODB.Connection Couldn't load state for 0x049398
Traceback (most recent call last):
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
POSKeyError: 0x049398
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/opt/zenoss/Products/ZenModel/Device.py", line 1825, in deleteDevice
    parent._delObject(self.getId())
  File "/opt/zenoss/Products/ZenRelations/ToManyRelationshipBase.py", line 87, in _delObject
    self.removeRelation(obj, suppress_events)
  File "/opt/zenoss/Products/ZenRelations/RelationshipBase.py", line 107, in removeRelation
    self._remove(obj, suppress_events=suppress_events)
  File "/opt/zenoss/Products/ZenRelations/ToManyContRelationship.py", line 166, in _remove
    notify(ObjectWillBeRemovedEvent(robj, self, robj.getId()))
  File "/opt/zenoss/lib/python/zope/event/__init__.py", line 31, in notify
    subscriber(event)
  File "/opt/zenoss/lib/python/zope/component/event.py", line 24, in dispatch
    zope.component.subscribers(event, None)
  File "/opt/zenoss/lib/python/zope/component/_api.py", line 136, in subscribers
    return sitemanager.subscribers(objects, interface)
  File "/opt/zenoss/lib/python/zope/interface/registry.py", line 336, in subscribers
    return self.adapters.subscribers(objects, provided)
  File "/opt/zenoss/lib/python/zope/interface/adapter.py", line 583, in subscribers
    subscription(*objects)
  File "/opt/zenoss/lib/python/zope/component/event.py", line 32, in objectEventNotify
    zope.component.subscribers((event.object, event), None)
  File "/opt/zenoss/lib/python/zope/component/_api.py", line 136, in subscribers
    return sitemanager.subscribers(objects, interface)
  File "/opt/zenoss/lib/python/zope/interface/registry.py", line 336, in subscribers
    return self.adapters.subscribers(objects, provided)
  File "/opt/zenoss/lib/python/zope/interface/adapter.py", line 583, in subscribers
    subscription(*objects)
  File "/opt/zenoss/lib/python/OFS/subscribers.py", line 101, in dispatchObjectWillBeMovedEvent
    dispatchToSublocations(ob, event)
  File "/opt/zenoss/lib/python/zope/container/contained.py", line 151, in dispatchToSublocations
    for ignored in zope.component.subscribers((sub, event), None):
  File "/opt/zenoss/lib/python/zope/component/_api.py", line 136, in subscribers
    return sitemanager.subscribers(objects, interface)
  File "/opt/zenoss/lib/python/zope/interface/registry.py", line 336, in subscribers
    return self.adapters.subscribers(objects, provided)
  File "/opt/zenoss/lib/python/zope/interface/adapter.py", line 583, in subscribers
    subscription(*objects)
  File "/opt/zenoss/lib/python/zope/container/contained.py", line 150, in dispatchToSublocations
    for sub in subs.sublocations():
  File "/opt/zenoss/Products/ZenRelations/ToManyContRelationship.py", line 340, in sublocations
    return (ob for ob in self.container.objectValuesAll())
  File "/opt/zenoss/Products/ZenRelations/ToManyContRelationship.py", line 230, in objectValues
    return self._safeOfObjects()
  File "/opt/zenoss/Products/ZenRelations/ToManyContRelationship.py", line 78, in _safeOfObjects
    for ob in self._objects.values():
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
POSKeyError: 0x049398
Content of /opt/zenoss/log/tracebacks/tmp5BmWVX.txt available here: http://pastebin.com/y98fPQUK
--------------------------------------------------------------

XiangJun Wu
2013-05-14 07:41:54 UTC
Permalink
XiangJun Wu [http://community.zenoss.org/people/goafter1981] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73165#73165

--------------------------------------------------------------
Hi,

I ran into the same issue.
I cannot remove the device from zendmd.
[***@ip-10-134-131-25 ~]$ ./zenoss-repairdevice.py


Checking P-TPU-176.34.146.253
ERROR:ZODB.Connection:Couldn't load state for 0x0cd469
Traceback (most recent call last):
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
POSKeyError: 0x0cd469
Traceback (most recent call last):
  File "./zenoss-repairdevice.py", line 97, in <module>
    processOneDevice(d)
  File "/opt/zenoss/lib/python/ZODB/transact.py", line 44, in g
    r = f(*args, **kwargs)
  File "./zenoss-repairdevice.py", line 78, in processOneDevice
    dev.checkRelations(repair=True)
  File "/opt/zenoss/Products/ZenRelations/RelationshipManager.py", line 257, in checkRelations
    rel.checkRelation(repair)
  File "/opt/zenoss/Products/ZenRelations/ToManyRelationship.py", line 286, in checkRelation
    if len(self._objects):
  File "/opt/zenoss/lib/python2.7/UserList.py", line 30, in __len__
    def __len__(self): return len(self.data)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 860, in setstate
    self._setstate(obj)
  File "/opt/zenoss/lib/python/ZODB/Connection.py", line 901, in _setstate
    p, serial = self._storage.load(obj._p_oid, '')
  File "/opt/zenoss/lib/python/relstorage/storage.py", line 476, in load
    raise POSKeyError(oid)
ZODB.POSException.POSKeyError: 0x0cd469

Any comments?
--------------------------------------------------------------

FCharlier
2013-05-14 11:36:50 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73198#73198

--------------------------------------------------------------
Still same problem here, no solution found :(
--------------------------------------------------------------

XiangJun Wu
2013-05-14 12:10:40 UTC
Permalink
XiangJun Wu [http://community.zenoss.org/people/goafter1981] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73199#73199

--------------------------------------------------------------
Is there a way to work around it?
--------------------------------------------------------------

Shane Scott
2013-05-14 15:14:12 UTC
Permalink
Shane Scott [http://community.zenoss.org/people/hackman238] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73203#73203

--------------------------------------------------------------
All,

This is ZODB corruption. More often than not it's extremely hard to fix. The best option would be to dump your config and set up a fresh instance, or roll back to a backup.
--Shane Scott (Hackman238)
--------------------------------------------------------------

XiangJun Wu
2013-05-14 23:06:41 UTC
Permalink
XiangJun Wu [http://community.zenoss.org/people/goafter1981] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73192#73192

--------------------------------------------------------------
It is a critical system for us. We want to keep the metrics history there. Can you show me the steps to dump the config and restore it to fix the broken ZODB? It would be highly appreciated.
--------------------------------------------------------------

Shane Scott
2013-05-15 12:47:19 UTC
Permalink
Shane Scott [http://community.zenoss.org/people/hackman238] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73220#73220

--------------------------------------------------------------
Xiang,

No worries. How many devices and what sort of custom work have you loaded into your Zenoss? Your performance data isn't stored in the ZODB, so that will be unaffected. You'll want to back up $ZENHOME/perf as this is where all the RRD data lives. If your setup is fairly stock I'd almost suggest you set up a second server running the same platform, reinstall Zenoss and your ZenPacks, then rsync over $ZENHOME/perf. Once you do that you can systematically add your devices back in to Zenoss. As long as the devices are added using the same ID as before they'll automatically point to their correct performance data directory under $ZENHOME/perf.
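
Before re-adding devices on the new server, it can help to confirm that each device ID actually has a directory in the copied performance data. A rough sketch, assuming the per-device directories sit under $ZENHOME/perf/Devices/<device id> (verify that layout on your own install); the function name and the example hostnames are made up for illustration:

```python
import os
import tempfile

def missing_perf_dirs(zenhome, device_ids):
    """Return the device IDs that have no directory under <zenhome>/perf/Devices."""
    perf_root = os.path.join(zenhome, "perf", "Devices")  # assumed layout
    return [dev_id for dev_id in device_ids
            if not os.path.isdir(os.path.join(perf_root, dev_id))]

# Demo against a throwaway directory tree rather than a real $ZENHOME.
demo_home = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_home, "perf", "Devices", "router1.example.com"))
print(missing_perf_dirs(demo_home, ["router1.example.com", "switch2.example.com"]))
```

Any ID the function reports would start with empty graphs once it is added back, so it is worth checking those names for typos before loading them.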

If you need to dump your ZODB to rebuild I generally recommend writing a short zendmd script that outputs every device's id, title, deviceClass, groups, systems and manageIp into a CSV or XML doc, which you can then read with a simple Python script to add the devices back in. You can also use batchloader, though I'm not 100% positive it works in v4; I can't claim to have tested it recently.
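
The CSV round trip described above could look roughly like the following. This is only a sketch of the file format, not a tested Zenoss script: the Device namedtuple is a stand-in for real device objects, the way each field is read from a real device depends on your Zenoss version, and the "|" separator for multi-valued fields is an arbitrary choice.

```python
import collections
import csv
import io

FIELDS = ["id", "title", "deviceClass", "groups", "systems", "manageIp"]

# Stand-in for a real device object; in zendmd the values would come from
# the device's own attributes and accessors instead.
Device = collections.namedtuple("Device", FIELDS)

def dump_devices(devices, out):
    """Write one CSV row per device; list fields are joined with '|'."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for dev in devices:
        row = dev._asdict()
        row["groups"] = "|".join(dev.groups)
        row["systems"] = "|".join(dev.systems)
        writer.writerow(row)

def load_devices(src):
    """Read the CSV back into dicts, splitting the list fields again."""
    rows = []
    for row in csv.DictReader(src):
        row["groups"] = [g for g in row["groups"].split("|") if g]
        row["systems"] = [s for s in row["systems"].split("|") if s]
        rows.append(row)
    return rows

devices = [
    Device("router1.example.com", "Router 1", "/Network/Router",
           ["/Routers", "/Core"], ["/Prod"], "192.0.2.10"),
    Device("web1.example.com", "Web 1", "/Server/Linux", [], [], "192.0.2.20"),
]
buf = io.StringIO(newline="")
dump_devices(devices, buf)
rows = load_devices(io.StringIO(buf.getvalue(), newline=""))
print(rows[0]["id"], rows[0]["groups"])
```

The loader half is where the "add devices back in" step would go: each row carries enough to recreate the device under the same ID, which is what keeps it pointed at its existing performance data.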

--Shane Scott (Hackman238)
--------------------------------------------------------------

Matt Jenkins
2013-05-20 19:08:32 UTC
Permalink
Matt Jenkins [http://community.zenoss.org/people/smartermatt] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73287#73287

--------------------------------------------------------------
I have over 1200 devices in Zenoss and am also getting this error. Is there any automated way to do the backup and restore? Every device has customer configuration parameters for SNMP or MySQL, etc.

I would think there would be a fix script available somewhere....
--------------------------------------------------------------

Shane Scott
2013-05-20 20:56:42 UTC
Permalink
Shane Scott [http://community.zenoss.org/people/hackman238] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73289#73289

--------------------------------------------------------------
Matt,

It's a very low-level error whose source can be hard to track down and even harder to fix. Unfortunately any automated process (e.g. zenbackup/zenrestore) could carry the problem over. Your best bet would be to roll back to your last backup from before the issue, or to get support or a consultant to dig into the problem interactively or migrate you to a fresh platform.

It's something I can help you with, but really only interactively.

Best,
--Shane Scott (hackman238)
http://shanewilliamscott.com
http://linkedin.com/in/shanewilliamscott
--------------------------------------------------------------

jmp242
2013-05-30 14:10:34 UTC
Permalink
jmp242 [http://community.zenoss.org/people/jmp242] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73387#73387

--------------------------------------------------------------
Was this Zenoss install initially 4.2.0? Or originally 4.2.3 . . .

--
James Pulver
ZCA Member
LEPP Computer Group
Cornell University
--------------------------------------------------------------

FCharlier
2013-05-30 14:15:09 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73389#73389

--------------------------------------------------------------
In my case, it's a 4.2.0 upgraded to 4.2.3.
--------------------------------------------------------------

XiangJun Wu
2013-05-31 09:53:34 UTC
Permalink
XiangJun Wu [http://community.zenoss.org/people/goafter1981] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73415#73415

--------------------------------------------------------------
I've fixed the issue by restoring from a backup.
Thank you!!
--------------------------------------------------------------

jmp242
2013-05-31 12:44:02 UTC
Permalink
jmp242 [http://community.zenoss.org/people/jmp242] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/73418#73418

--------------------------------------------------------------
We're sorry about this. The research I've done implies this is corruption caused by MySQL exiting mid-transaction, thereby corrupting objects in the relstorage ZopeDB. This is an issue if, say, MySQL is killed by the OOM killer, so sizing the Zenoss server appropriately is critical. The corruption can linger unnoticed for some time if you don't access the corrupted object, but you will see errors in some logs. Repair is an involved manual process, requiring a Zenoss Guru.

This is why ZCA best practices are to back up early and back up often. Restoring a backup is much quicker (and cheaper than hiring a consultant).

I have been informed that 4.2.3 (and 4.2.4) include many fixes to try and prevent this sort of corruption from happening in the future, but it is important to not have MySQL drop a transaction due to a "kill"...

--
James Pulver
ZCA Member
CLASSE Computer Group
Cornell University
--------------------------------------------------------------

FCharlier
2013-07-24 09:26:56 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/74104#74104

--------------------------------------------------------------
I tried this solution but the original problem is still there on the new instance! It seems to me that we are definitely stuck with this problem :-(
--------------------------------------------------------------

j053ph4
2013-10-07 15:28:48 UTC
Permalink
j053ph4 [http://community.zenoss.org/people/j053ph4] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/74854#74854

--------------------------------------------------------------
Hi All,

I just ran into this error myself... I had one bad device that was throwing several different POSKeyError messages, but I was able to fix it (I think) using the following methods in the dmd. No guarantee it will work, but I can load the device status page again, so that's at least an improvement.

All commands are in zendmd:

First, I set variables for the bad device and a good one for comparison:

a = dmd.Devices.findDevice('baddevicename')
b = dmd.Devices.findDevice('goodevicename')

# try the following:
for x in a.objectValues():  print x
# hopefully you will see the error(s) at this point, if so, then:
for x in b.objectValues():  print x

# you can then compare the output of both and try to see which attributes are causing the error.  In my case, the bad attributes were 'maintenanceWindows', 'adminRoles', 'userCommands', and 'componentSearch'.  So then for each one I ran (on the bad device):

delattr(a, 'attributename')

# after that I ran the following snippet to recreate the relations:

for name, schema in a._relations:
    try:
        print "setting %s" % name
        a._setObject(name, schema.createRelation(name))
    except Exception:
        pass

# and after that the "componentSearch" attribute was still missing.  Reviewing the code under Products/ZenModel/Device.py, it seemed that it should be recreated when calling "getDeviceComponents()", so I ran:

a.getDeviceComponents()

which recreated the attribute.  This indicates that there could be other attributes that require different methods to recreate, but the pattern seems to be:

     1) find the offending attribute
     2) delete the offending attribute
     3) recreate the attribute

This seems to have cleared up the issues.  Hopefully this will help someone, but please be careful as I'm not certain this method won't cause problems itself.

Thanks,
Joseph
--------------------------------------------------------------

FCharlier
2013-10-08 08:24:34 UTC
Permalink
FCharlier [http://community.zenoss.org/people/FCharlier] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/74864#74864

--------------------------------------------------------------
Hi Joseph,

Thanks for this workaround!

>>> b = dmd.Devices.findDevice('goodevicename')
>>> for x in b.objectValues():  print x
...
<ToManyRelationship at dependencies>
<ToManyRelationship at dependents>
<ToOneRelationship at deviceClass>
<ToOneRelationship at perfServer>
<ToOneRelationship at location>
<ToManyRelationship at systems>
<ToManyRelationship at groups>
<ToManyContRelationship at maintenanceWindows>
<ToManyContRelationship at adminRoles>
<ToManyContRelationship at userCommands>
<ToManyRelationship at monitors>
<ToManyContRelationship at openvz_containers>
<OperatingSystem at os>
<DeviceHW at hw>
<ZCatalog at componentSearch>
<ToManyContRelationship at msmqqueues>
So I can't compare the attributes of the bad device with those of the good one.

Regards,

Fabrice
--------------------------------------------------------------

j053ph4
2013-10-15 15:30:24 UTC
Permalink
j053ph4 [http://community.zenoss.org/people/j053ph4] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/74965#74965

--------------------------------------------------------------
Well, I ran into this again for some reason, so here's a little function that can be copy/pasted into zendmd and run against an offending device object.  It basically does what was mentioned above:

def fixPOSKeyError(d):
    print "testing %s" % d.id
    # Probe every sub-object; delete any that raises (e.g. POSKeyError)
    for k, v in d.objectItems():
        try:
            print k, v.remoteName(), v.remoteClass()
        except Exception:
            print "FIXING %s" % k
            delattr(d, k)

    print "rebuilding relations for %s" % d.id
    # Recreate each relationship declared in the class schema; errors are ignored
    for name, schema in d._relations:
        try:
            print "setting %s" % name
            d._setObject(name, schema.createRelation(name))
        except Exception:
            pass
    print "rebuilding component index"
    d.buildRelations()
    d.checkRelations(repair=True)
    d.os.buildRelations()
    d.os.checkRelations(repair=True)
    print "%s has %s components" % (d.id, len(d.getDeviceComponents()))
    # Changes made in zendmd are only saved once you call commit()

Hope this helps,
Joseph
--------------------------------------------------------------

cbuskirk
2013-10-25 19:32:56 UTC
Permalink
cbuskirk [http://community.zenoss.org/people/cbuskirk] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75095#75095

--------------------------------------------------------------
fixPOSKeyError('InfinityWeb24')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "<console>", line 2, in fixPOSKeyError
AttributeError: 'str' object has no attribute 'id'
Any ideas?
Thanks,
Chris
--------------------------------------------------------------

Ryan Matte
2013-10-25 19:34:24 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75096#75096

--------------------------------------------------------------
That's because you're running it wrong.

d = find('yourdevicename')
fixPOSKeyError(d)
--------------------------------------------------------------

rgartley
2013-11-15 17:16:51 UTC
Permalink
rgartley [http://community.zenoss.org/people/rgartley] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75269#75269

--------------------------------------------------------------
Ryan,

I have a stupid question for you. I'm trying to create that little function you have there and I'm having an issue. When running the *def fixPOSKeyError(d):* command it obviously goes into some edit mode. Once the rest of the code is pasted in, how does one save and close it? I'm not the zendmd guru here but I'd like to work this issue.

Thanks in advance
--------------------------------------------------------------

Ryan Matte
2013-11-15 17:19:01 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75270#75270

--------------------------------------------------------------
It's not an "edit mode", it's an interactive python prompt.  You paste the function code in, then once you get back to the prompt you just do what I described at the prompt and it'll run.
--------------------------------------------------------------

rgartley
2013-11-15 17:49:37 UTC
Permalink
rgartley [http://community.zenoss.org/people/rgartley] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75271#75271

--------------------------------------------------------------
Thanks for the quick response and the correction on the interactive python prompt.... still... not being a python guy here, I'm unable to run that. See my output below....

...     d.checkRelations(repair=True)
...     d.os.buildRelations()
...     d.os.checkRelations(repair=True)
...     print "%s has %s components" % (d.id, len(d.getDeviceComponents()))
... d = find('mydevice')
  File "<console>", line 23
    d = find('mydevice')
    ^
SyntaxError: invalid syntax
>>> fixPOSKeyError(d)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
NameError: name 'fixPOSKeyError' is not defined
The first 4 lines are the last part of the cut and paste of your code. Notice how its prompt is ...

When I then run the *d = find('yourdevicename')* command right after, that's where the syntax error happens, and then it returns to the >>> prompt.

So what happens is....
1. I enter zendmd and am presented with the >>> prompt
2. I enter the first line of your function *def fixPOSKeyError(d):* and am then presented with the ... prompt
3. I enter the code to the function and when complete am still at the ... prompt.
4. At the ... prompt I enter *d = find('mydevice')* and get a syntax error.

After entering your function and ending at the ... prompt, how do I get back to the >>> prompt to run the
*def fixPOSKeyError(d):*  command? End? Quit? Close?

I hope that makes sense and I apologize for the ignorance. Just looking for a fix to this error and yours seems to be right on track.
--------------------------------------------------------------

Ryan Matte
2013-11-18 13:29:40 UTC
Permalink
Ryan Matte [http://community.zenoss.org/people/rmatte] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75276#75276

--------------------------------------------------------------
Try pasting the definition code in line by line instead.  Pasting large blocks of code into zendmd doesn't always work well.  Once all of the code is in, hit Enter twice, then follow the steps I explained previously to run it.
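
The SyntaxError in the paste above is what an interactive prompt produces when a new statement arrives while it is still waiting for the function body to end; an empty line closes the block. A small sketch with the standard-library codeop module shows the difference, assuming zendmd behaves like the standard Python prompt. The function body here is a placeholder, not the real fix.

```python
import codeop

body = "def fixPOSKeyError(d):\n    return d"

# Without a trailing empty line the block is still open: the prompt stays at "..."
still_open = codeop.compile_command(body) is None

# An empty line (one more Enter) ends the definition and returns to ">>>"
closed = codeop.compile_command(body + "\n") is not None
```

Both still_open and closed come out true, which is why d = find('mydevice') has to be typed only after the prompt has gone back to >>>.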
--------------------------------------------------------------

jmp242
2013-11-18 14:54:43 UTC
Permalink
jmp242 [http://community.zenoss.org/people/jmp242] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75277#75277

--------------------------------------------------------------
This may be helpful - how to use zendmd stuff:
https://zcaportal.org/wiki/bin/view/ZCA/ZCAFAQ#Someone_gave_me_a_zendmd_script_45_what_do_I_do_with_it_63

--
James Pulver
ZCA Member
CLASSE Computer Group
Cornell University
--------------------------------------------------------------

bknotts
2013-11-19 22:54:15 UTC
Permalink
bknotts [http://community.zenoss.org/people/bknotts] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75296#75296

--------------------------------------------------------------
Much thanks for your work on this

This is the (sanitized) output I get when I run the function:

testing devicename.domain.tld
dependencies dependents <class 'Products.ZenModel.ManagedEntity.ManagedEntity'>
dependents dependencies <class 'Products.ZenModel.ManagedEntity.ManagedEntity'>
deviceClass devices <class 'Products.ZenModel.DeviceClass.DeviceClass'>
perfServer devices <class 'Products.ZenModel.PerformanceConf.PerformanceConf'>
location devices <class 'Products.ZenModel.Location.Location'>
systems devices <class 'Products.ZenModel.System.System'>
groups devices <class 'Products.ZenModel.DeviceGroup.DeviceGroup'>
maintenanceWindows productionState <class 'Products.ZenModel.MaintenanceWindow.MaintenanceWindow'>
adminRoles managedObject <class 'Products.ZenModel.AdministrativeRole.AdministrativeRole'>
userCommands commandable <class 'Products.ZenModel.UserCommand.UserCommand'>
monitors devices <class 'Products.ZenModel.StatusMonitorConf.StatusMonitorConf'>
os deviceClass <class 'Products.ZenModel.Device.Device'>
hw deviceClass <class 'Products.ZenModel.Device.Device'>
componentSearch deviceClass <class 'Products.ZenModel.Device.Device'>
msmqqueues msmqserver <class 'ZenPacks.zenoss.MSMQMonitor.MSMQQueue.MSMQQueue'>
rebuilding relations for devicename.domain.tld
setting %s dependencies
setting %s dependents
setting %s deviceClass
setting %s perfServer
setting %s location
setting %s systems
setting %s groups
setting %s maintenanceWindows
setting %s adminRoles
setting %s userCommands
setting %s monitors
setting %s msmqqueues
rebuilding component index
devicename.domain.tld has 231 components

Looks like something is not working as it is supposed to.  Nothing seems to get fixed and the POSKeyError persists.  What am I doing wrong?
--------------------------------------------------------------

j053ph4
2013-11-25 18:02:38 UTC
Permalink
j053ph4 [http://community.zenoss.org/people/j053ph4] created the discussion

"Re: POSKeyError for one device"

To view the discussion, visit: http://community.zenoss.org/message/75330#75330

--------------------------------------------------------------
I found an unintended consequence of the script I posted.  It apparently breaks the device side of the systems, groups, and locations relationships on the device (but not the organizer side).

The following function will fix the issue (copy/paste into zendmd as before):

def fixOrganizerMembership():
    # Re-add each organizer to the device side of the relationship where it is missing
    for o in dmd.Systems.getSubOrganizers():
        for d in o.devices():
            if o.getDmdKey() not in d.getSystemNames():
                d.systems._add(o)
    for o in dmd.Groups.getSubOrganizers():
        for d in o.devices():
            if o.getDmdKey() not in d.getDeviceGroupNames():
                d.groups._add(o)
    for o in dmd.Locations.getSubOrganizers():
        for d in o.devices():
            # getLocationName() returns a single string, so compare for equality
            if o.getDmdKey() != d.getLocationName():
                d.location._add(o)
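
Before calling _add, the same loops can be run as a report only. Here is a minimal sketch for the Systems case; missing_system_links is a hypothetical helper that uses only the calls already shown above (getSubOrganizers, devices, getDmdKey, getSystemNames) and changes nothing.

```python
def missing_system_links(systems_root):
    """Return (device id, organizer key) pairs the device side does not list."""
    missing = []
    for org in systems_root.getSubOrganizers():
        key = org.getDmdKey()
        for dev in org.devices():
            if key not in dev.getSystemNames():
                missing.append((dev.id, key))
    return missing
```

In zendmd it would be called as missing_system_links(dmd.Systems); an empty list means the device side already matches the organizer side.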

Hope this helps,
Joseph
--------------------------------------------------------------
