Hardware failure database

From PrgmrWiki

Requirements for the hardware failure database:


track hardware failures

We want to know what hardware we have where.

We want to figure out which makes, models, and types of hardware fail most often.

If we take hardware from dead servers and put it in live servers, we want to keep track of hardware that is involved in more than its fair share of problems. This gives us a statistical means of finding hardware that is likely to be bad even when we have no way of confirming that it is bad.
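
As a rough illustration of the "more than its fair share" idea, the sketch below counts the incidents that happened on whichever server a part was installed in at the time, then flags parts whose count is well above the average. Everything here is an assumption made for the example: the record layouts, the part and server names, the day numbers, and the 1.25 threshold. It also ignores how long each part was installed, which a real version would probably need to normalize for.

```python
from collections import Counter

# Hypothetical install history: (part_id, server_id, installed_day, removed_day)
# removed_day is None while the part is still installed.
installs = [
    ("disk-1", "srv-a", 0, 10),
    ("disk-1", "srv-b", 10, None),   # disk-1 was moved from srv-a to srv-b
    ("disk-2", "srv-a", 0, None),
    ("ram-1", "srv-b", 0, None),
]

# Hypothetical incident log: (server_id, day). The cause may be unknown.
incidents = [("srv-a", 3), ("srv-a", 7), ("srv-b", 12), ("srv-b", 15)]


def incidents_per_part(installs, incidents):
    """Count incidents that occurred on a server while each part was in it."""
    counts = Counter()
    for part, server, start, end in installs:
        for inc_server, day in incidents:
            in_window = start <= day and (end is None or day < end)
            if inc_server == server and in_window:
                counts[part] += 1
    return counts


def suspects(installs, incidents, factor=1.25):
    """Return parts whose incident count exceeds factor * the mean count."""
    counts = incidents_per_part(installs, incidents)
    parts = {part for part, *_ in installs}
    mean = sum(counts[p] for p in parts) / len(parts)
    return sorted(p for p in parts if counts[p] > factor * mean)


print(incidents_per_part(installs, incidents))
print(suspects(installs, incidents))
```

In this made-up data, disk-1 follows the trouble from srv-a to srv-b, so it is the only part flagged even though nothing confirms that it is bad.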

We want to keep track of servers that are flakier than other servers, so the database needs to be able to record problems that we are not sure are related to hardware.

This system needs to handle servers, server parts (so it needs a hierarchical structure), and network and terminal appliances. (Should cables be sub-parts of those?)
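
One possible shape for the hierarchical part, sketched with SQLite from the Python standard library: a single table in which every item may point at a parent item. The table name, column names, and sample rows are all hypothetical, and treating a cable as a sub-part of a switch is only one answer to the open question above, not a decision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id        INTEGER PRIMARY KEY,
    kind      TEXT NOT NULL,               -- server, disk, ram, switch, cable, ...
    label     TEXT NOT NULL,
    parent_id INTEGER REFERENCES item(id)  -- NULL for top-level items
);
""")

# Hypothetical inventory: a server with two parts, a switch with one cable.
rows = [
    (1, "server", "srv-a", None),
    (2, "disk", "disk-1", 1),
    (3, "ram", "ram-1", 1),
    (4, "switch", "sw-1", None),
    (5, "cable", "cable-7", 4),
]
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", rows)


def subtree(conn, root_id):
    """Return the labels of an item and everything nested under it."""
    query = """
    WITH RECURSIVE tree(id, label) AS (
        SELECT id, label FROM item WHERE id = ?
        UNION ALL
        SELECT item.id, item.label
        FROM item JOIN tree ON item.parent_id = tree.id
    )
    SELECT label FROM tree ORDER BY id
    """
    return [row[0] for row in conn.execute(query, (root_id,))]


print(subtree(conn, 1))
print(subtree(conn, 4))
```

Moving a part from a dead server to a live one is then a change of parent_id. Keeping the history of those moves would need a separate table, which this sketch leaves out.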


lifftchi: i would put 'interface with the ticketing system' on the wishlist.

- 20:03 - lifftchi: if rt supports tags, tag all hw affected by a ticket with something like hwdb:id:nnnn.
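
If the ticketing system can attach tags in roughly that form, pulling the hardware IDs back out could look like the sketch below. It only parses a list of tag strings; how the tags would be fetched from RT is not shown, and the function name and the exact tag format (hwdb:id: followed by digits) are assumptions based on the suggestion above.

```python
import re

# Assumed tag format: "hwdb:id:" followed by a numeric hardware ID.
TAG_RE = re.compile(r"^hwdb:id:(\d+)$")


def hardware_ids(tags):
    """Extract hardware database IDs from a ticket's tags, skipping other tags."""
    ids = []
    for tag in tags:
        match = TAG_RE.match(tag.strip())
        if match:
            ids.append(int(match.group(1)))
    return ids


# Hypothetical tags on one ticket.
print(hardware_ids(["hwdb:id:42", "billing", "hwdb:id:1007", "hwdb:id:"]))
```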

ryan5141: So we have combinations of parts in various servers, and we want a database system which can keep track of which servers have which parts and which part/server combinations have the highest frequency of problems?