Open Hardware Distribution and Documentation Working Group: Getting better by tracing failures

Julieta Arancio

[Image source: Sabine]

This is the ninth post in the series of the Open Hardware Distribution and Documentation Working Group (which we’ve shortened to “DistDoc”). The group aims to produce a proof of concept for distributed open science hardware (OScH) manufacturing, exploring key aspects like quality, documentation, and business models, using a paradigmatic case study as a starting point. We hope the experience motivates others to discuss and implement new strategies for OScH expansion.

Our previous post detailed the reasoning the team followed regarding the next steps for quality analysis and quality control. Here we pick up by showing the main output of that work: the failure mode analysis spreadsheet. As a reminder, a failure mode analysis details every possible way in which a product can fail: which component or subsystem is failing? What is the cause, and what is the effect? Once identified, these failures provide a clear path to making a better product.

So what does the failure mode analysis for OpenFlexure look like? First of all, failures are divided into those that occur during manufacturing and those that occur during use. Manufacturing failures slow down the manufacturing process but will not affect end users unless they go uncaught. Use failures emerge during use, and may be related to a manufacturing failure that was missed. “Parts have curled edges” is an example of a manufacturing failure, while “motor wires getting tangled” is associated with use of the microscope.
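To make that structure concrete, here is a minimal sketch of what one row of such an analysis could look like in code. This is an illustration, not the team’s actual schema: the field names, the FailureMode and FailureCategory names, and the hypothetical cause shown are all assumptions.

```python
# A minimal sketch of one entry in a failure mode analysis,
# following the manufacturing/use split described above.
from dataclasses import dataclass
from enum import Enum

class FailureCategory(Enum):
    MANUFACTURING = "manufacturing"  # slows production; harmless if caught
    USE = "use"                      # emerges in the field, affects end users

@dataclass
class FailureMode:
    component: str   # which component or subsystem is failing
    cause: str       # why it fails
    effect: str      # what the builder or user observes
    category: FailureCategory

# Example from the post; the cause given here is hypothetical.
curled_edges = FailureMode(
    component="printed parts",
    cause="e.g. poor bed adhesion during 3D printing (illustrative)",
    effect="parts have curled edges",
    category=FailureCategory.MANUFACTURING,
)
```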

[Figure: general view of the team’s proto-failure mode analysis]

Once identified, failures are rated on a 1–5 scale. For manufacturing failures, the rating captures the chance of the failure occurring before the acceptable time. For use failures, the scale captures severity: 1 means lost time, 2–3 means minor maintenance is required, 4 means major maintenance is required, and 5 means data loss or sample damage. Severity indicates whether a simple mitigation is possible or a redesign is needed. The use failure sheet also includes columns to identify whether a failure affects the educational microscope, the professional-grade version, or both.
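As a rough illustration, the use-failure severity scale could be encoded like this. The redesign threshold in needs_redesign is an assumption; the post only says that severity informs whether a simple mitigation suffices or a redesign is needed.

```python
# A sketch of the 1-5 use-failure severity scale described in the post.
SEVERITY_LABELS = {
    1: "lost time",
    2: "minor maintenance required",
    3: "minor maintenance required",
    4: "major maintenance required",
    5: "data loss / sample damage",
}

def needs_redesign(severity: int, threshold: int = 4) -> bool:
    """Illustrative rule of thumb: high-severity failures may call for a
    redesign rather than a simple mitigation. The threshold value is an
    assumption, not a rule stated by the team."""
    return severity >= threshold
```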

Each failure has a detailed description that shows which component is failing, the acceptable failure time, and possible causes. However, the most interesting part is what comes next: how do we fix this failure? Annotations record the relation between the failure and the general maintenance schedule, other mitigations already in place, a list of mitigations needed, the chance of the failure happening before the acceptable time, and the severity of the failure. This allows the team to prioritize and plan the most urgent mitigation work. The last columns are reserved for testing details and for recording the results of mitigation efforts.
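The post doesn’t specify how the two ratings are combined for prioritization, but a common convention in failure mode analysis is to multiply likelihood by severity into a simple risk priority score. A sketch under that assumption, with made-up ratings purely for illustration:

```python
# A sketch of ranking mitigation work by a simplified risk priority
# score (chance * severity); the formula and ratings are illustrative.
from dataclasses import dataclass

@dataclass
class RatedFailure:
    name: str
    chance_before_acceptable_time: int  # 1-5
    severity: int                       # 1-5

    @property
    def priority(self) -> int:
        return self.chance_before_acceptable_time * self.severity

failures = [
    RatedFailure("Motor wires getting tangled", 3, 2),  # ratings invented
    RatedFailure("Parts have curled edges", 4, 1),      # ratings invented
]
for f in sorted(failures, key=lambda f: f.priority, reverse=True):
    print(f"{f.name}: priority {f.priority}")
```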

One important aspect to address is who completes the spreadsheet. For instance, how do we capture the full range of failures that occur in different contexts, such as people building and using the microscope in Tanzania? The team agreed that input from manufacturers can be useful to show failures in context, and that the document should support version control. Also, a 100% online system can be a barrier in some locations, so a mixture of online and offline processes would be ideal.

Currently the spreadsheet is hosted in Google Docs, a temporary solution for identifying the main failures before moving to a better tool that lets the team host their data independently of Google and produce more professional results for audit compliance. Ideally, this spreadsheet can inform the life of each microscope, with lead users providing feedback on what fails and when, so as to build accurate statistics. When users buy the product, they can check all the quality control marks that were recorded. And not only users benefit: the developer team can inform manufacturers of the best practices needed to achieve the best possible product quality.
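As a sketch of what an independently hosted, audit-friendly record could look like, here is a minimal per-unit quality control log written to CSV. The serial numbers, check names, and file layout are all assumptions for illustration, not the team’s actual format.

```python
# A sketch of per-unit quality control records that could travel with
# each microscope once the data moves out of Google Docs.
import csv
from datetime import date

qc_checks = [
    {"serial": "OFM-0001",  # hypothetical serial number
     "check": "stage travel within spec",
     "passed": True,
     "date": date(2020, 12, 1).isoformat()},
    {"serial": "OFM-0001",
     "check": "no curled edges on printed parts",
     "passed": True,
     "date": date(2020, 12, 1).isoformat()},
]

with open("qc_log.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["serial", "check", "passed", "date"])
    writer.writeheader()
    writer.writerows(qc_checks)
```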

One of the key lessons is that failure mode analysis gets much easier if it’s done early, before the final documentation is in place: the mitigation measures trigger changes that will have to be documented. In this vein, we celebrated one of the new features implemented by part of the team in GitBuilding: the ability to produce figures/pictures of the microscope “in real time” as the documentation is modified, using GitLab CI. This should make it easier to keep visual documentation up to date.

Some members of the team will connect after the holidays to work on documentation issues that need to be tackled. Considering what a year 2020 has been (and it’s not finished yet!), we decided to take a break for the holidays and reconvene in mid-January. Happy holidays everyone, and see you next year 🙂
