Q: Is it true that oximeters are not reliable below 70% SpO2?

Not necessarily. This commonly recited myth stems from the fact that oximeters are only tested in clinical validation studies down to 70% SpO2, both because that is the range the FDA requires and because of limits set by institutional review boards. So, while oximeters are not usually validated clinically below 70% SpO2, and oximeter performance becomes less accurate at lower SpO2, readings below 70% should not be disregarded. Wildly inaccurate SpO2 values below ~70% are more often due to poor signal in a critically ill patient with poor perfusion than to an inherent limitation of oximeter technology.
Keywords: accuracy, inaccurate

Q: What does the CE marking mean?

The CE marking is a mark that manufacturers must obtain for their devices or products to be sold within the European Union (EU). Each company has a notified body that is paid by the company to ensure ISO compliance, and the company must demonstrate to the notified body that the device meets the relevant ISO standard. The marking signifies that the product complies with European health, safety, and environmental protection standards. It may also be found on products sold outside of the EU that have been manufactured to these standards. The CE marking is required for medical devices, and having it confirms that the device meets the essential requirements of the European General Medical Devices Directive and that it is safe. Of note, the CE marking is not a quality indicator or a certification mark, but it does indicate that the device complies with EU regulations and can legally be sold. Also, CE marking can sometimes be found on pulse oximeters fraudulently or for reasons other than conformity with ISO-80601 (i.e. the ISO standard for pulse oximetry). Please note that our listing of devices having CE marking status is based on manufacturer product literature and does not necessarily indicate conformity with ISO-80601.
References: European Commission CE Marking
Keywords: CE, marking, European general medical devices directive, EEA

Q: What does the IP rating mean?

An IP rating, also known as an ingress protection rating, describes a product’s ability to keep out liquid and dust. IP ratings were defined and developed by the International Electrotechnical Commission (IEC) and are recognized all over the world. The IP code is composed of two numbers. The first number rates the device’s protection against solid particles (e.g. dust), from 0 (no protection) to 6 (full protection). The second number rates the device’s protection against liquids, from 0 (no protection) to 9 (protection from high-pressure, high-temperature liquid).
References: IEC: IP Ratings
Keywords: IP, ingress protection, dust, liquid, IEC
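As a rough illustration of the two-digit code described in this answer, the sketch below splits a code such as IP67 into its solid and liquid digits. The function name `parse_ip_code` is hypothetical, and the sketch assumes only the plain two-digit form; any other variants of the code are not handled.

```python
import re

# Upper bound for the first digit, as stated in the answer above:
# 0 (no protection) to 6 (full protection) against solid particles.
MAX_SOLID_DIGIT = 6


def parse_ip_code(code: str) -> tuple[int, int]:
    """Split a code such as 'IP67' into (solid_digit, liquid_digit).

    Hypothetical helper for illustration only; it accepts just the
    plain two-digit form of the code.
    """
    match = re.fullmatch(r"IP(\d)(\d)", code.strip().upper())
    if match is None:
        raise ValueError(f"not a two-digit IP code: {code!r}")
    solid, liquid = int(match.group(1)), int(match.group(2))
    if solid > MAX_SOLID_DIGIT:
        raise ValueError(f"solid-particle digit must be 0-6, got {solid}")
    # A single digit is always within 0-9, so the liquid digit needs no check.
    return solid, liquid


if __name__ == "__main__":
    solid, liquid = parse_ip_code("IP67")
    print(f"solids: {solid}/6, liquids: {liquid}/9")
```

A device labelled IP22 would come back as (2, 2), i.e. low on both scales.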

Q: What is ISO and what standards exist for pulse oximeters?

The International Organization for Standardization (ISO) is a worldwide standard-setting body. It is composed of representatives from several national standards organizations and aims to provide common standards for technology, agriculture, healthcare, and other manufactured products. These standards are meant to ensure that products and services are safe and of good quality. ISO has standards for various healthcare-related devices and products, including pulse oximeters. ISO 80601-2-61:2017 describes the requirements for basic safety and essential performance of pulse oximeter equipment. This also includes standards for the pulse oximeter monitor, pulse oximeter probe, and probe cable extender.
References: ISO: Pulse Oximeter
Keywords: ISO, international organization for standardization

Q: What is the meaning of ‘substantially equivalent’ for FDA approval?

To receive FDA clearance for a new device intended for human use, a premarket submission called a 510(k) must be made to the FDA. This must demonstrate that the device is substantially equivalent to a legally marketed device (the predicate), meaning that it is just as safe and effective. According to the FDA, a device is considered “substantially equivalent” if the following criteria are met:
Has the same intended use as the predicate device; and
Has the same technological characteristics as the predicate;
OR
Has the same intended use as the predicate; and
Has different technological characteristics and does not raise different questions of safety and effectiveness; and
The device is demonstrated to be as safe and effective as the legally marketed device
Once this information has been submitted, the FDA determines whether the device is safe and effective through a thorough review process, including evaluation of performance data and technological characteristics. In the US, a device may not be marketed until the FDA determines that substantial equivalence is present.
References: FDA Premarket Notification 510(k)
Keywords: substantially equivalent, SE, FDA, 510k
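The two alternative routes in the criteria above can be read as a small piece of boolean logic. The sketch below is only that: the class `DeviceComparison` and its field names are hypothetical, the grouping of the final criterion with the second route is our reading of the text, and a real determination is made by FDA review, not by a checklist.

```python
from dataclasses import dataclass


@dataclass
class DeviceComparison:
    """Hypothetical summary of a new device versus its predicate."""
    same_intended_use: bool
    same_technological_characteristics: bool
    raises_different_safety_questions: bool
    shown_as_safe_and_effective: bool


def is_substantially_equivalent(c: DeviceComparison) -> bool:
    """Mirror the two routes listed in the answer above."""
    if not c.same_intended_use:
        # Both routes require the same intended use as the predicate.
        return False
    if c.same_technological_characteristics:
        # Route 1: same intended use and same technology.
        return True
    # Route 2: different technology, no different safety questions,
    # and a demonstration that the device is as safe and effective.
    return (not c.raises_different_safety_questions
            and c.shown_as_safe_and_effective)


if __name__ == "__main__":
    new_sensor = DeviceComparison(True, False, False, True)
    print(is_substantially_equivalent(new_sensor))
```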

Question 1

Can pulse oximeter performance be validated by an in vitro device?

Accepted Answer

The short answer is no. There is no true pulse oximeter simulator, and no existing device can reliably predict clinical performance for all oximeters.
There are several devices that exist to ‘simulate’ or ‘test’ pulse oximeter function including the Fluke ProSim SPOT, Fluke ProSim8 and Whaleteq AECG100. 
These devices work by detecting LED pulses from an oximeter, then fabricating their own output signal to the oximeter sensor. The Fluke preserves any noise in the LED light signal, while the others simply trigger from it and in doing so eliminate all potential errors due to LED noise (a very common problem with implications for oximeter performance). Some of these devices also have a limited range of simulated conditions (e.g. perfusion) and require prior calibration to the specific oximeter being tested. Thus, for oximeters already calibrated into the in vitro simulator, the simulator can help confirm that a device is working (or at least is sensing signal) as it was designed to do in the factory. For devices not previously calibrated, conclusions about performance are uncertain.
Multiple studies report having used these devices to ‘validate’ oximeter performance, yet the performance of an uncalibrated oximeter on an in vitro simulator does not ensure good oximeter performance in reality.
Simulators of this type are useful for determining the operating range of an oximeter.  For example the limits in terms of dark skin and low perfusion can be found.  One can determine if the device reads an erroneous value when pushed beyond its effective range or if it reports nothing.
The OpenOximetry.org Project is working to develop novel in vitro testing devices and protocols that overcome the limitations of prior devices. The hope is that newer techniques can better augment human study subject testing and predict device performance. 
 
Read here for more on oximeter performance validation requirements

Question 2

Do pulse oximeters require regular calibrations while in clinical use?

Accepted Answer

Most pulse oximeters do not need to be calibrated by the user before they will work properly. During device development, the microprocessors of pulse oximeters are calibrated using reference SaO2 measurement data that are compiled from healthy volunteers. Volunteers are typically exposed to different levels of inspired oxygen to yield SaO2 ranging from around 75 to 100%. 
There are devices such as the Fluke ProSim 8 that are designed to check whether select manufacturers’ oximeters are functioning, but they are not designed to validate accuracy.
References: Lifebox Pulse Oximetry Learning Module
Keywords: calibration

Question 3

How accurate are pulse oximeters?

Accepted Answer

For oximeters that have been approved by the US FDA, accuracy over the SpO2 range of 70-100% is within a few SpO2 percent of the actual value when tested under laboratory validation conditions (for some oximeters, performance may be different when used in various clinical conditions). Most oximeters tend to be more accurate with higher SpO2 values than lower SpO2 values. 
The FDA has published guidelines on the recommended standards for pulse oximeter performance, including device accuracy. These guidelines call for devices to be tested with a minimum of 200 data points over an SaO2 range of 70% to 100%, and to be tested on people with different skin tones. Studies have shown that many low-cost pulse oximeters give highly inaccurate readings, although some have performed with similar accuracy to more expensive units when used in healthy subjects. It is also important to recognize that many additional variables can make even good pulse oximeters less accurate.
References: Lifebox Pulse Oximetry Learning Module; FDA Guidelines; Lipnick et al, Anesth Analg 2016
Keywords: accuracy, low-cost, guidelines, FDA

Question 4

How is pulse oximeter clinical validation done for FDA 510k clearance and ISO CE marking?

Accepted Answer

For FDA 510(k) clearance or CE marking under the ISO standard, pulse oximeters must undergo testing in healthy human study subjects.
ISO 80601-2-61 defines the test requirements, and the FDA has published guidance that refers to this standard. The standard requires the devices to be tested with a minimum of 200 data points (paired observations: pulse oximeter, co-oximeter) evenly distributed over an SaO2 range of 70% to 100%. The standard requires that devices be tested on “10 or more healthy subjects that vary in age and gender” and that the study has subjects with “a range of skin pigmentations, including at least 2 darkly pigmented subjects or 15% of the subject pool, whichever is larger.”
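The numeric minimums quoted above (200 paired data points, 10 or more subjects, and at least 2 darkly pigmented subjects or 15% of the pool, whichever is larger) can be sketched as a simple check. The function names are hypothetical, and rounding the 15% figure up to a whole person is our assumption; the quoted text does not say how fractions are handled. The sketch also ignores the other requirements, such as even distribution across the SaO2 range.

```python
MIN_SUBJECTS = 10       # "10 or more healthy subjects"
MIN_PAIRED_POINTS = 200  # minimum paired SpO2/SaO2 observations


def min_dark_subjects(n_subjects: int) -> int:
    """At least 2 subjects or 15% of the pool, whichever is larger.

    Assumption: 15% is rounded up to a whole subject. Integer
    arithmetic avoids floating-point surprises in the rounding.
    """
    fifteen_percent_rounded_up = (15 * n_subjects + 99) // 100
    return max(2, fifteen_percent_rounded_up)


def meets_cohort_minimums(n_subjects: int, n_dark: int, n_pairs: int) -> bool:
    """Check only the three numeric minimums quoted in the answer."""
    return (n_subjects >= MIN_SUBJECTS
            and n_pairs >= MIN_PAIRED_POINTS
            and n_dark >= min_dark_subjects(n_subjects))


if __name__ == "__main__":
    for n in (10, 14, 20):
        print(n, "subjects ->", min_dark_subjects(n), "darkly pigmented")
```

With 10 subjects the floor of 2 applies; once the pool grows past 13 subjects, the 15% rule becomes the larger of the two.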
Briefly, study subjects are semi-supine (30° head up) with a nose clip (to prevent breathing through the nose), breathing controlled mixtures of air, nitrogen, and carbon dioxide via a mouthpiece from a partial rebreathing circuit, with a voluntarily increased minute ventilation (sometimes coached with a metronome) and 10 to 20 L/min total fresh gas flow into the circuit. A 22-gauge radial artery catheter is placed to sample arterial blood for the measurement of SaO2.
The gas mixture (e.g. nitrogen) of the circuit is manually adjusted to achieve a series of 10 to 12 stable SaO2 plateaus between 70% and 100% (approximately 70%, 73%, 76%, 80%, 83%, 86%, 89%, 92%, 95%, 98%, and 100%). Carbon dioxide is manually added to the circuit to prevent hypocapnia. At each plateau (after 30-60 seconds of stability) an arterial sample is drawn and “functional” arterial SaO2 (HbO2/[Hb+HbO2]) is immediately determined by multi-wavelength oximetry (e.g. an ABL-90, OSM3 or similar device). After an additional 30 seconds of stability another sample is drawn. These SaO2 values are recorded and compared with the simultaneous SpO2 values recorded from the device being tested. During the tests, subjects’ extremities may be placed under a warmer to promote good perfusion.
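The “functional” saturation formula quoted above, HbO2/[Hb+HbO2], is simple enough to write out directly. In this minimal sketch the function name and the choice of g/dL as the example unit are assumptions; any unit works as long as both inputs share it.

```python
def functional_sao2(hbo2: float, hb: float) -> float:
    """Functional SaO2 in percent: HbO2 / (Hb + HbO2) * 100.

    hbo2: oxygenated hemoglobin concentration
    hb:   deoxygenated hemoglobin concentration (same unit as hbo2)
    """
    total = hbo2 + hb
    if hbo2 < 0 or hb < 0 or total <= 0:
        raise ValueError("concentrations must be non-negative and not both zero")
    return 100.0 * hbo2 / total


if __name__ == "__main__":
    # Example values only (g/dL); not taken from any study.
    print(functional_sao2(13.5, 1.5))
```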
References: FDA Guidelines 
Keywords: FDA, guideline, clinical, accuracy, 510k, approval, ISO

Question 5

How is skin color measured for the purposes of oximeter performance validation?

Accepted Answer

The short answer is that there is no standardized approach.
According to FDA regulations, manufacturers of pulse oximeters should validate device performance on “a range of skin pigmentations, including at least 2 darkly pigmented subjects or 15% of the subject pool, whichever is larger.” While ‘darkly pigmented’ is not formally defined, it is routinely defined by testing labs as Fitzpatrick skin phototype groups V-VI. The Fitzpatrick scale is a widely used method for classifying skin pigment, from I (pale white) to VI (darkest brown). However, it was originally developed for skin photosensitivity typing, which is not the same as skin color, and should not be conflated with race or ethnicity. Studies have shown inaccuracies in self-reported values, especially in darker skin types. There are other similar visual skin color classification systems, but as with the Fitzpatrick scale, they are all subjective. 
In early 2021, in response to Sjoding et al’s findings of racial (note, not necessarily skin color) bias in pulse oximetry measurement, the FDA issued a statement encouraging further investigation on pulse oximeter accuracy in darker skin types. Given the limitations of the Fitzpatrick scale, there is a need to adopt a standardized objective method for measuring skin color that provides reliable, quantitative, and easily interpretable results. For more discussion on new ways to address racial inequity in oximetry, check out this paper (November 2021).
There are several commonly used methods for determining ‘skin color.’ Not all are used for the purposes of pulse oximeter validation cohort diversity. Below we discuss some of these methods including information on a few commonly encountered colorimeters on the market, including Mexameter, Colorimeter, and DermaCatch.
Below is a table summarizing several traditional classification systems for assessing skin color (not quantitative):
Reference: Ware et al, Racial Limitations of Fitzpatrick skin type, Cutis, 2020
 
Fitzpatrick Scale 
The Fitzpatrick skin type (FST) was developed in 1975 as a tool to assess the likelihood to burn during phototherapy treatments and is commonly used. The original scale was I-IV and it was only later that more diverse skin types were included. 
References: Image: CC BY 3.0 – D’Orazio et al, Int J Mol Sci, 2013; Eilers et al, Accuracy of Self-Report in assessing Fitzpatrick skin phototypes, JAMA Derm, 2013 ; Ware et al, Racial Limitations of Fitzpatrick skin type, Cutis, 2020 ; Wong et al, Analysis of discrepancies between pulse oximetry and arterial oxygen saturation measurements by race and ethnicity and association with organ dysfunction and mortality, JAMA, 2021
 
Von Luschan’s Chromatic Scale
The Von Luschan scale was developed as a tool to assess racial classifications according to skin color. It consists of 36 colored tiles which are compared to a person’s skin color.
This image is a reproduction of the Von Luschan scale which was originally created by Felix von Luschan and printed in Voelker, Rassen, Sprachen (1927). The original chart was scanned and the skin colors were copied using the paint program’s dropper tool.
Our team at Open Oximetry calculated the RGB values which are superimposed on the image of the scale. Notably, tiles 13 and 14 were found to have the same RGB values.
References: Wikimedia Von Luschan’s Scale

Other ways to quantify skin pigment
Below we describe a few (of the many) products on the market to quantify skin pigmentation.
Mexameter® (MX 18) by Courage & Khazaka

Narrow-band reflectance colorimeter (only targets melanin and hemoglobin)
Output: melanin index and erythema index (arbitrary units 0-999)
Melanin index can be falsely affected by erythema
Requires base to be plugged in to display results (no computer needed)

Konica Minolta Chromameter

The CM-700d Spectrophotometer is portable, wireless, and lightweight
Allows for evaluation, reproduction, and control of pigment colors
Aperture switches between 3mm and 8mm sizes to evaluate small to large samples
Color spaces: L*a*b*, L*C*h, Hunter Lab, Yxy, XYZ, Munsell

For more information on studies conducted with the CM-700d Spectrophotometer, please view the following: 2022 study, 2020 study, and 2017 study.

Colorimeter® (CL 400) by Courage & Khazaka

Full visible spectrum reflectance colorimeter
Output: skin color RGB, L*a*b, xyz, ITA
Requires USB connection to Windows PC to display results

SkinColorCatch® (previously DermaCatch) by Delfin Technologies

Full visible spectrum reflectance colorimeter
Output: skin color L*a*b, ITA, L*c*h, RGB, melanin and erythema indices
Melanin and erythema indices insensitive to each other
Portable and battery operated

Other Skin pigment scales:
Color Bar Tool
The color bar tool was developed by dermatologists to determine the amount of melanin pigment in various skin types and assess an individual’s potential to get a sunburn. This was done by comparing the color tool against the sun protected skin of the upper inner arm, and is best used when people are not flushed from exercise or heat to avoid any excess redness that might distort the visual perception of color.
 
Munsell Color Chart
The Munsell color chart (also known as the Munsell system of color notation) was created to specify an objective assessment of color based on three properties of color: hue, value, and chroma. Hue refers to the basic color which is described as red (R), yellow (Y), green (G), blue (B), and purple (P) across a range of values. Value refers to the lightness or darkness of a color, which ranges from 10 (pure white) to 0 (pure black). Chroma describes the color intensity, which ranges from light pastel colors to deeply saturated colors. The Munsell chart was originally designed as the official color system for soil colors and to be used in agriculture, but has been adopted for use to describe and assess human skin tones. Multiple sets of color tiles are available for purchase and can be compared against an individual’s skin to determine the closest color match. 
Various studies have used the Munsell system for skin color assessment. Many are related to dermatologic treatments and use the Munsell chart to describe the colors of capillary vascular malformations, melasma, hyperpigmentation, and other skin discolorations. The most common Munsell color charts used to determine skin pigments were the 5YR and 7.5YR, though 2.5YR and 10YR were also frequently used.
References: Reeder et al, CEBP, 2014, Wright et al, Skin Research & Technology, 2015, Ries et al, CHEST, 1989, Foglia et al, J Pediatr, 2017, Konishi et al, J Dermatol, 2007, Wright et al, Skin Res Technol, 2016, McCreath et al, J Adv Nurs, 2016
 
Massey-Martin Scale
The Massey-Martin Scale (also known as the New Immigrant Survey or NIS Skin Color Scale) is an 11-point scale, ranging from zero to 10, which was created to assess skin color. Zero represents albinism, or the absence of color, and 10 represents the darkest possible skin tone. The shades of skin color from 1 to 10 are depicted on the scale as a hand which differs in color for each number. The Massey-Martin scale has been used in various publications, especially on topics related to skin color and the economic mobility of immigrants. The scale is intended to be essentially memorized by interviewers so that they can determine a respondent’s skin tone without the respondent seeing the chart.
References: Hannon et al, Public Opin Q, 2016, Babool et al, Rheumatology, 2022, Massey NIS Skin Color Scale, 2003
 
Monk Skin Tone Scale
Additionally, some new color scales have been designed with the goal of improving skin tone evaluation in machine learning (ML) and artificial intelligence (AI), rather than for skin pigmentation quantification. The Skin Tone Research team at Google AI partnered with Harvard sociology professor Dr. Ellis Monk to open-source the Monk Skin Tone (MST) Scale, a 10-tone scale designed to represent a broader range of skin colors. Since AI can perpetuate unfair biases against people with darker skin pigmentation in camera technologies and image products, the MST is being used to evaluate datasets and ML models for better representation.

Interpreting quantitative results
Correlating quantitative measures to a visual color spectrum
Agache 2017 Measuring the Skin, Chapter 6 “The Measurement of Skin Color”:

Skin color is determined by melanin content, oxy/deoxy-hemoglobin content, and endogenous/exogenous pigments such as bilirubin and carotene
The CIE L*a*b* color space is a device-independent reference for all the colors visible to the human eye and the most commonly used metric to quantify skin color
Individual Typology Angle (ITA) is calculated from L* and b* and objectively categorizes skin color into 6 different groups, from very light to dark
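To make the ITA bullet concrete, the sketch below computes the angle from L* and b* using the formula given elsewhere in this FAQ, ITA = arctan((L*-50)/b*), expressed in degrees. The category cut-offs (55°, 41°, 28°, 10°, -30°) are the ones commonly cited in the dermatology literature; they are not stated on this page, so treat them as an assumption. The function names are hypothetical.

```python
import math

# Commonly cited lower bounds in degrees (assumption, not from this page).
ITA_CATEGORIES = [
    (55.0, "very light"),
    (41.0, "light"),
    (28.0, "intermediate"),
    (10.0, "tan"),
    (-30.0, "brown"),
]


def individual_typology_angle(l_star: float, b_star: float) -> float:
    """ITA in degrees from CIE L* and b*.

    atan2 matches arctan((L* - 50) / b*) for positive b*, which is the
    usual case for skin, and avoids a division by zero when b* is 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_category(ita_degrees: float) -> str:
    """Map an ITA value to one of six groups, very light to dark."""
    for lower_bound, name in ITA_CATEGORIES:
        if ita_degrees > lower_bound:
            return name
    return "dark"


if __name__ == "__main__":
    ita = individual_typology_angle(70.0, 20.0)
    print(round(ita, 1), ita_category(ita))
```

Higher L* (lighter skin) pushes the angle up, so the categories run from very light at high angles to dark at strongly negative ones.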

Chau et al 2019 Cutaneous Colorimetry as Gold Standard for Skin Color Measurement:

Explanation of CIE L*a*b color space
Explanation of how skin colorimeters and spectrophotometers work

Visscher 2017 Skin color and pigmentation in ethnic skin:

Correlates Fitzpatrick, ITA, Melanin Index

Wilkes et al 2015 Fitzpatrick Skin Type, Individual Typology Angle, and Melanin Index in an African Population: Steps Toward Universally Applicable Skin Photosensitivity Assessments:

Correlates Fitzpatrick, ITA, Melanin index

Keywords: skin pigment, Fitzpatrick, melanin, skin color, Colorimeter
Additional References: Okunlola et al, Pulse oximeter performance, racial inequity, and the work ahead, Resp Care, Nov 2021; FDA guidance on pulse oximeters; February 19, 2021 FDA statement “Pulse Oximeter Accuracy and Limitations”: addresses Sjoding’s findings, research limitations, and the need for further study; Sjoding et al, Racial Bias in Pulse Oximetry Measurement, NEJM, Dec 2020

Comparison Table of Quantitative Methods for Assessing Skin Color:

Devices
Konica Minolta CM 700d (Brochure, Manual)
Courage + Khazaka CL 400 (Brochure)
Delfin SkinColorCatch

Device mechanism
Konica Minolta CM 700d: Spectrophotometer
Courage + Khazaka CL 400: Tristimulus colorimetry with L*a*b color system
Delfin SkinColorCatch: Visible spectrum reflectance colorimeter

Dimensions
Konica Minolta CM 700d:
·       Dimensions: 73 x 211.5 x 107mm
·       Aperture: 3mm or 8mm
·       Weight: 550g (without cap or batteries)
Courage + Khazaka CL 400:
·       Dimensions: 13cm
·       Aperture: 5mm
·       Weight: 85g
Delfin SkinColorCatch:
·       Dimensions: 198 x 40 x 35mm
·       Aperture: ?
·       Weight: 145g (with batteries)

Illumination
Konica Minolta CM 700d:
·       Light source: pulsed xenon lamp with UV cut filter
·       Illuminated area: 6mm vs 11mm depending on aperture
·       Angle: 8° viewing angle, 2° or 10° observer angle
·       Wavelength range: 400-700nm
Courage + Khazaka CL 400:
·       Light source: 8 LEDs arranged circularly
·       Illuminated area: ~17mm
·       Wavelength range: 440-670nm
Delfin SkinColorCatch:
·       Light source: 3 white LEDs arranged circularly
·       Illuminated area: 0.3cm^2
·       Angle: 45° to minimize gloss
·       RGB range: 25-246
·       Peak wavelengths: 620/540/460nm

Power
Konica Minolta CM 700d:
·       Special AC adapter
·       4 AA batteries
Courage + Khazaka CL 400:
·       Plug into MPA device
Delfin SkinColorCatch:
·       2 AA batteries

Measurements
Konica Minolta CM 700d:
·       Eligible ISO 17025 certification
·       L*, a*, b*
·       L*, c*, h*
·       Yxy
·       XYZ
·       Munsell (Hue, Value, Chroma)
·       Color differences
Courage + Khazaka CL 400:
·       XYZ
·       RGB
·       L*, a*, b* index values (do not fully correspond to ISO)
·       Calculates ITA
Delfin SkinColorCatch:
·       Melanin index, Erythema index (insensitive to each other)
·       L*, a*, b* index
·       Calculates ITA
·       L*, c*, h*
·       RGB

Reliability
Konica Minolta CM 700d:
·       Zero calibration & White calibration functions
·       Spectral reflectance: SD <0.1%
·       Chromaticity value: SD <∆E*ab 0.04
·       Inter-instrument agreement: <∆E*ab 0.2 (for 8mm aperture)
·       Short-term repeatability: higher with continuous measurements and larger aperture size
Courage + Khazaka CL 400:
·       Calibration check function
·       Measurement error: +/- 5%
·       Interobserver reliability: ICC 0.79-0.97 moderate/good (Van der Wal et al 2013)
Delfin SkinColorCatch:
·       Calibration check function
·       Not affected by ambient light due to direct skin contact
·       Optical orifice designed to minimize pressure induced skin blanching

Data transfer
Konica Minolta CM 700d:
·       Mini USB cable
·       Bluetooth?
Courage + Khazaka CL 400 (MPA device dependent):
·       Standalone display
·       Mini USB cable
·       Wireless
Delfin SkinColorCatch:
·       Bluetooth cable (wireless receiver unit)

Software
Konica Minolta CM 700d:
·       CM-SA Skin Analysis software (Brochure, Manual) (Windows PC)
Courage + Khazaka CL 400:
·       MPA CTplus software (Brochure 1, Brochure 2)
·       Includes RHT 400 for ambient temperature and humidity
·       MPA System software (old) (Brochure) (Windows PC)
Delfin SkinColorCatch:
·       Delfin Modular Core (DMC) software (Windows PC)
·       Includes RoomSensor for ambient temperature and humidity

Data
Konica Minolta CM 700d:
·       Measurements above
·       Melanin index
·       Hb index
·       Hb SO2 index
Courage + Khazaka CL 400:
·       Measurements above
Delfin SkinColorCatch:
·       Measurements above
·       Tabulated by Sites, Sessions, Subjects, and Instruments
·       Note: Automatically takes measurements in order of Session instead of Site which is less reliable (requires removing & reapplying probe at each site)

Graphics
Konica Minolta CM 700d:
·       Hue / Value graph
·       Hb index / Melanin index graph
Courage + Khazaka CL 400: Yes
Delfin SkinColorCatch: Yes

Export to CSV
Konica Minolta CM 700d:
·       Data above
·       Spectral reflectance (400-700nm)*2
·       Does NOT include ITA: ITA = arctan((L*-50)/b*)
·       Does NOT include RGB: Nix Color Converter
Courage + Khazaka CL 400: Yes
Delfin SkinColorCatch: Yes
·       But exports each parameter as separate file
·       Can export data horizontally or vertically

Literature
·       C+K
·       Delfin

Table of Studies Comparing Quantitative Methods:
Systematic Review 2021

Study
Devices

Wang et al, 2018
CM-700d Spectrophotometer (Minolta)
PR-650 Telespectroradiometer (PhotoResearch)

Globale Dermatologie
 
Meng et al, 2020
Chromameter (Minolta)
Mexameter (C+K)
DermaCatch (Colorix)

Matias et al, 2015
Mexameter MX-18 (C+K)
Colorimeter CL-400 (C+K)
Antera (Miravex)

Van der Wal et al, 2013
Mexameter MX-18 (C+K)
Colorimeter CL-400 (C+K)
DSM II ColorMeter

Measurement of Skin by Agache
Chromameter (Minolta)
Derma Spectrophotometer (Cortex)
Mexameter (C+K)
DermaCatch (Colorix)

Baquie & Kasraee, 2014
Mexameter MX-16 (C+K)
DermaCatch (Colorix)

Uter et al, 2013
Chromameter CR-300 (Minolta)
Reflektometer RM-100

Barel et al, 2001
Visi-Chroma VC-100
Chromameter CR-200 (Minolta)

Kerckhove, 2001
Chromameter CR-300 (Minolta)

Note: DermaCatch (Colorix) is now SkinColorCatch (Delfin)

Question 6

How to Interpret Open Oximetry Testing Results on the Device Data Page

Accepted Answer

When you click on a device to view its standard performance details, some of the terms can be unfamiliar. To help, we’ve put together this guide to explain some of these terms and concepts. 
NOTE: If you come across something that isn’t explained here, you can simply hover over the dark grey “i” button for a quick explanation. If you’re still unsure, feel free to contact us, and we’ll clarify!
--------------------------------------------------
How do we determine how accurate the oxygen saturation measured by the pulse oximeters is? 
Pulse oximeters estimate oxygen saturation (SpO2), which is the percentage of hemoglobin in the blood that is bound to oxygen. Pulse oximeters do this non-invasively by shining light through the skin. The most accurate way to measure oxygen saturation, however, involves taking a blood sample from an artery and analyzing it with a specialized device called a blood gas analyzer.
To assess the accuracy of pulse oximeters, we compare the oxygen saturation from the pulse oximeter to the oxygen saturation from the blood gas analyzer. We do this in healthy adults, in a controlled lab setting, by gradually and safely lowering their oxygen levels from a saturation of 100% down to 70%. Specialized statistical methods like ‘ARMS’ (described elsewhere in this FAQ) are then used to evaluate the pulse oximeter’s performance. For more information about our process, please refer to our study protocol.
--------------------------------------------------
Can you explain what the manufacturer-claimed ARMS (Root Mean Square Error) for the SpO2 range of 70-100% refers to?
The ARMS can be a confusing term to describe pulse oximeter performance, but it is the most common metric used and is required by regulatory bodies. ARMS stands for accuracy root mean square. It tells you, on average, how close the device’s readings are to your true blood oxygen levels, when the oxygen levels are between 70% and 100%, based on tests in healthy adults in controlled laboratory studies.
It’s helpful to understand that a lower ARMS means the device readings are more accurate and precise. For example, with an ARMS of 2%, the device’s readings would typically be within approximately 2-4% of the true oxygen level. So, if your true oxygen saturation is 94%, then the pulse oximeter reading may be between 90% and 98%. 
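For readers who want to see the arithmetic, ARMS is the square root of the mean squared difference between paired SpO2 (pulse oximeter) and SaO2 (blood gas analyzer) readings. The minimal sketch below uses hypothetical function names and made-up example values; it is not the lab's analysis code. The mean bias is included because a device can read consistently high or low, and that offset also contributes to ARMS.

```python
import math
from typing import Sequence


def arms(spo2: Sequence[float], sao2: Sequence[float]) -> float:
    """Accuracy root mean square of paired SpO2/SaO2 readings (in % points)."""
    if len(spo2) != len(sao2) or len(spo2) == 0:
        raise ValueError("need equal-length, non-empty paired readings")
    squared_errors = [(s - a) ** 2 for s, a in zip(spo2, sao2)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_bias(spo2: Sequence[float], sao2: Sequence[float]) -> float:
    """Average signed difference (SpO2 minus SaO2)."""
    if len(spo2) != len(sao2) or len(spo2) == 0:
        raise ValueError("need equal-length, non-empty paired readings")
    return sum(s - a for s, a in zip(spo2, sao2)) / len(spo2)


if __name__ == "__main__":
    # Made-up example pairs: the oximeter reads 2 points low, then 2 high.
    oximeter = [92.0, 96.0]
    reference = [94.0, 94.0]
    print("ARMS:", arms(oximeter, reference))
    print("bias:", mean_bias(oximeter, reference))
```

In this toy example the errors cancel out in the bias (0) but not in the ARMS (2), which is why ARMS is the more demanding summary of the two.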
Under the standard performance information for a particular device, there are two different ARMS values listed: 1) the ARMS reported in the product manual by the manufacturer (manufacturer-claimed ARMS), and 2) the independent ARMS that we calculate from our testing here at the UCSF Hypoxia Lab using our study protocol.
We follow current FDA guidance and ISO standards for testing and also add more elements to our testing protocols to better account for diversity of multiple factors like different skin colors and different levels of blood flow to the hands (factors that may impact performance of pulse oximeters).
For more information about ARMS, check out this detailed FAQ.
--------------------------------------------------
What does the “Independent ARMS Study Cohort Size” mean?
This refers to how many healthy adults were tested using the OpenOximetry.org Protocol for this particular device. We share preliminary results after testing in at least 10 people (which is the minimum number required by the FDA guidance and ISO standards as of October 2024), but we continue testing the devices on as many different participants as possible to improve the diversity of people included. We are also waiting for updated guidelines to know the best number of participants for these tests.
--------------------------------------------------
How is skin pigment defined?
Skin pigment refers specifically to the amount and type of melanin in your skin. People with darker skin have more melanin, while those with lighter skin have less.
We’re working hard to understand how skin pigment affects pulse oximetry readings. To learn more about how we measure skin pigment in our studies, check out our Skin Color Quantification page and our FAQ on skin pigment.
--------------------------------------------------
Why might openoximetry.org results be different from what a manufacturer or other testing has reported?
While our testing follows FDA guidance and ISO standards for evaluating pulse oximeter performance, some aspects of our procedures may differ from those used in other testing labs. For instance, we test devices on a different group of people than those used by other labs, and device performance can vary between groups. Additionally, one way our testing may differ, while still adhering to ISO standards and FDA guidelines, is in how we modify participants&#8217; physiology. For example, in pulse oximeter testing on the fingers, it is common practice (and allowed under current regulations as of October 2024) to warm participants&#8217; hands to improve circulation and enhance the signal. This may result in better performance for some devices during warming than without warming. However, for the Open Oximetry Project, we do not routinely warm participants&#8217; hands during testing.
----------------------------------------
Why might openoximetry.org results change over time?
Our results will continue to evolve as we test devices in more people, learn more about how best to measure performance, and respond to changes in regulatory recommendations for testing. This is an ongoing process, and the science is still developing, so it’s not perfect yet. Many factors affect how pulse oximeters perform, and we can’t account for all of them with the current testing protocols. As we collect more data from a wider range of devices and more diverse groups of people, our understanding and results may change. So, it’s a good idea to check back regularly for updated information!
----------------------------------------
What does pulsatility amplitude mean?
Pulse oximeters look for the pulsatile (i.e. beating or changing) signal coming from the blood vessels to measure oxygen levels in the blood. Devices report this pulsatility in various ways, such as perfusion index, pulsatility amplitude, or percent modulation. These values give us an idea of how strong the blood flow signal is in the area being measured. For more information, check out our FAQ on perfusion.
----------------------------------------
What is differential bias?
Differential bias (sometimes called disparate bias) refers to how much a pulse oximeter’s performance can vary between people with light skin and those with dark skin. To measure this, we compare how much the device over- or under-estimates blood oxygen levels (SpO2) in people with very light skin versus those with very dark skin*.
*Using modeling of skin pigment and SpO2 data, very light skin and very dark skin are defined as having a difference in individual typology angle (ITA) of 100. Read more about differential bias here.
----------------------------------------
What is lifetime cost?
*CAUTION: Our Lifetime Cost Calculator is a beta feature, makes many assumptions, is not based on the specific manufacturer/model durability, and costs may not reflect cost of ownership in well-resourced or home care settings. Costs are based solely on the purchase cost for this oximeter and general assumptions about how long a device with this form factor (i.e., fingertip vs handheld vs tabletop) might be expected to last in a heavy-use environment (e.g., a resource-variable clinical setting with frequent use, where damage, loss and theft are possible). These assumptions were informed by input from clinicians across diverse settings. Lifetime ownership cost varies considerably by user and setting. Click the settings button next to the cost to view the formula and adjust these assumptions based on your local context and use case.

Question 7

Is it true that oximeters are not reliable below 70% SpO2?

Accepted Answer

Not necessarily. This commonly recited myth stems from the fact that oximeters are only tested in clinical validation studies down to 70% SpO2, both because of the stated FDA requirement and because of limits set by institutional review boards. So, while oximeters are not usually validated clinically below SpO2 70%, and oximeter performance becomes less accurate at lower SpO2, oximeter readings below 70% should not be disregarded. Wildly inaccurate SpO2 values below ~70% are more often due to poor signal in a critically ill patient with poor perfusion than to an inherent limitation of oximeter technology.
Keywords: accuracy, inaccurate

Question 8

What does the CE marking mean?

Accepted Answer

The CE marking is a mark that manufacturers must obtain in order for their devices or products to be sold within the European Union (EU). Each company has a notified body that is paid by the company to ensure ISO compliance. The company must demonstrate to the notified body that the device meets the desired ISO standard. This marking signifies that the product is in compliance with European health, safety, and environmental protection standards. The CE marking may also be found on products sold outside of the EU that have been manufactured to these standards.
The CE marking is required for medical devices, and having it confirms that the device meets essential requirements of the European General Medical Devices Directive and that it is safe. Of note, the CE marking is not a quality indicator or a certification mark, but it does indicate that the device complies with EU regulations and can legally be sold. 
Also, CE marking for pulse oximeters can sometimes be found on devices fraudulently or for reasons other than conformity with ISO-80601 (i.e. the ISO standard for pulse oximetry). Please note that our listing of devices having CE marking status is based on manufacturer product literature and does not necessarily indicate conformity with ISO-80601.
 
References: European Commission CE Marking
Keywords: CE, marking, European general medical devices directive, EEA

Question 9

What does the IP rating mean?

Accepted Answer

An IP rating, also known as the ingress protection rating, is the rating of a product’s ability to withstand liquid and dust. IP ratings were defined and developed by the International Electrotechnical Commission (IEC), and are recognized all over the world.
The IP code is composed of two numbers:

The first number rates the device’s protection against solid particles (i.e. dust). It is rated from 0 (no protection) to 6 (full protection).
The second number rates the device’s protection against liquids. It is rated from 0 (no protection) to 9 (protection from high-pressure, high-temperature liquid).
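
As a minimal sketch of how the two digits described above can be read programmatically, the Python below splits a code such as "IP67" into its solid and liquid ratings. The function name `parse_ip_code` is hypothetical, the digit ranges come from the description above, and the handling of "X" (a digit that was not rated) is an added assumption; optional suffix letters that some codes carry are not handled.

```python
import re

def parse_ip_code(code):
    """Split an IP code such as 'IP67' into (solids, liquids) ratings.

    Returns a tuple of ints. A digit written as 'X' (not rated) becomes None.
    Assumption of this sketch: no trailing letters after the two digits.
    """
    match = re.fullmatch(r"IP\s*([0-6X])([0-9X])", code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable IP code: {code!r}")
    solids, liquids = (None if d == "X" else int(d) for d in match.groups())
    return solids, liquids

# First digit: solids (0 to 6). Second digit: liquids (0 to 9).
print(parse_ip_code("IP67"))
print(parse_ip_code("IPX4"))
```

Rejecting out-of-range digits (for example a first digit of 7) keeps the sketch consistent with the 0 to 6 and 0 to 9 scales stated above.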

References: IEC: IP Ratings
Keywords: IP, ingress protection, dust, liquid, IEC

Question 10

What is ISO and what standards exist for pulse oximeters?

Accepted Answer

The International Organization for Standardization (ISO) is a worldwide standard-setting body. It is composed of representatives from several national standards organizations and aims to provide common standards for technology, agriculture, healthcare, and other manufactured products. These standards are meant to ensure that products and services are safe and of good quality.
ISO has standards for various healthcare-related devices and products, including pulse oximeters. ISO 80601-2-61:2017 describes the requirements for basic safety and essential performance of pulse oximeter equipment. This also includes standards for the pulse oximeter monitor, pulse oximeter probe, and probe cable extender.
References: ISO: Pulse Oximeter
Keywords: ISO, international organization for standardization

Question 11

What is the average root mean square error (ARMS)?

Accepted Answer

The Average Root Mean Square error, or root mean square deviation (known as ARMS), is a commonly used metric for pulse oximeter oxygen saturation (SpO2) performance and is used by regulatory agencies to determine how well a pulse oximeter performs. Sometimes referred to as “Accuracy Root Mean Square error,” ARMS is the square root of the mean of the squared deviations between the pulse oximeter SpO2 measurement and the gold-standard functional oxygen saturation (SaO2) measurement obtained from an arterial blood sample analyzed on a co-oximeter. ARMS approximates the mean absolute deviation (MAD) between SpO2 and SaO2. It is never negative, so it cannot indicate whether SpO2 is negatively or positively biased relative to SaO2. Higher ARMS implies worse performance. ARMS is best referred to as a measure of performance (rather than accuracy).
$$\mathrm{ARMS} = \sqrt{\mathrm{bias}^2 + \mathrm{precision}^2}$$
Bias describes whether the SpO2 over- or underestimates the true SaO2 (i.e. magnitude and direction of error relative to the bullseye of the target). Precision describes how much random error there is (i.e. magnitude but not direction relative to the bullseye of the target). See figure 1 below.
If either or both components of ARMS (bias or precision) are very large, then ARMS will exceed the acceptable limit and the device will not be considered an acceptably performing pulse oximeter. Conversely, a device can have no error in one of the components but still have error in the other component.
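
To make the relationship between the three quantities concrete, here is a small Python sketch that computes bias, precision, and ARMS from paired readings and checks that ARMS squared equals bias squared plus precision squared. The function name `arms_components` and the paired values are invented for illustration, not study data. Precision is taken here as the population standard deviation of the errors (dividing by n), which is an assumption of this sketch; with the sample standard deviation (dividing by n - 1) the identity holds only approximately.

```python
import math

def arms_components(spo2, sao2):
    """Return (bias, precision, arms) for paired SpO2 and SaO2 values in %."""
    errors = [s - a for s, a in zip(spo2, sao2)]
    n = len(errors)
    bias = sum(errors) / n  # mean error: sign shows over- or underestimation
    # Spread of the errors around the bias (population SD, divides by n).
    precision = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    # Root mean square of the raw errors.
    arms = math.sqrt(sum(e ** 2 for e in errors) / n)
    return bias, precision, arms

# Hypothetical paired readings, for illustration only.
spo2 = [97, 94, 91, 88, 84, 80]
sao2 = [96, 92, 90, 85, 83, 76]

bias, precision, arms = arms_components(spo2, sao2)
print(f"bias={bias:.2f}  precision={precision:.2f}  ARMS={arms:.2f}")
print(math.isclose(arms ** 2, bias ** 2 + precision ** 2))
```

In this made-up example every error is positive, so the bias term alone accounts for most of the ARMS, which the single ARMS number cannot reveal by itself.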
Several examples in the table and figures below help illustrate the challenges in interpreting ARMS and applying this metric at the clinical bedside. For example, if there is no bias and only imprecision (i.e. random error), then an ARMS of 3% implies that most of the time (roughly two-thirds of readings, if errors are normally distributed) a pulse oximeter SpO2 of 90% is within ±3% (87%-93%) of the true oxygen saturation (SaO2, as measured by a blood gas co-oximeter). However, if bias (e.g. positive bias in patients with darker skin pigment) is also present, then interpreting the ARMS is more difficult.
The 2013 guidance from the FDA recommends an acceptable limit of ARMS ≤3% for transmittance devices and ARMS ≤3.5% for reflectance devices or ear clip devices. The latest 2025 Draft Guidance from the FDA recommends ARMS <3% for all devices, with the addition of a 95% confidence interval. It is important to recognize that even a small difference in ARMS between devices, especially if concurrent random error and bias exist, can reflect a large difference in their performance. See Figure 2 below from Sjoding et al 2022. 
The figure above is reproduced from Sjoding et al 2022: “Distribution of arterial oxygen saturation and saturation rate <88% when the pulse oximeter is reading at 92%. (A) Pulse oximeter with 2% random error. (B) Overlaid distribution of a pulse oximeter with 2% random error and 1% bias. (C) Overlaid distribution of a pulse oximeter with 2.5% random error and 1% bias.”
Sjoding et al, Am J Respir Crit Care Med, 2022; This figure is open access and distributed under the terms of the Creative Commons Attribution Non-Commercial No Derivatives License 4.0 Copyright © 2023 by the American Thoracic Society
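
The kind of calculation the caption describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the published analysis: errors are assumed to be normally distributed, "bias" is taken to mean that SpO2 reads higher than SaO2 on average, and the function name `prob_sao2_below` is hypothetical.

```python
import math

def prob_sao2_below(threshold, spo2_reading, bias, random_error_sd):
    """P(true SaO2 < threshold) for a displayed SpO2, under a normal error model.

    bias is the average of SpO2 minus SaO2 (positive = device overestimates).
    """
    mean_sao2 = spo2_reading - bias
    z = (threshold - mean_sao2) / random_error_sd
    # Standard normal cumulative distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The three scenarios named in the caption, all with a displayed SpO2 of 92%.
scenarios = {
    "A: 2% random error, no bias": (0.0, 2.0),
    "B: 2% random error, 1% bias": (1.0, 2.0),
    "C: 2.5% random error, 1% bias": (1.0, 2.5),
}
for label, (bias, sd) in scenarios.items():
    p = prob_sao2_below(88.0, 92.0, bias, sd)
    print(f"{label}: P(SaO2 < 88%) = {p:.1%}")
```

Under these assumptions the share of readings that hide a true saturation below 88% rises from about 2.3% to about 6.7% to about 11.5% across the three scenarios, even though the ARMS values differ by well under one percentage point.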

References:
2013 FDA Guidance
2025 FDA Draft Guidance
Hess, Respiratory Care, 2024
Clinical Application of ARMS in Pulse Oximetry, Clinimark, June 2021
Keywords: bias, standard deviation, Arms, ARMS

Question 12

What is the meaning of ‘substantially equivalent’ for FDA approval?

Accepted Answer

To bring most new devices intended for human use to the US market, a manufacturer makes a premarket submission called a 510(k) to the FDA (a successful 510(k) results in FDA clearance rather than approval). The submission must demonstrate that the device is substantially equivalent to a legally marketed device (the predicate), meaning that it is just as safe and effective.
According to the FDA, a device is considered “substantially equivalent” if the following criteria are met:

Has the same intended use as the predicate device; and
Has the same technological characteristics as the predicate;
OR
Has the same intended use as the predicate; and
Has different technological characteristics and does not raise different questions of safety and effectiveness; and
The device is demonstrated to be as safe and effective as the legally marketed device
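
The and/or structure of the criteria above can be easier to follow as a small decision function. The Python below is only a sketch of one reading of that list, in which the "demonstrated to be as safe and effective" condition belongs to the second branch; the function and parameter names are hypothetical, and the real determination is a review by the FDA, not a boolean check.

```python
def is_substantially_equivalent(same_intended_use,
                                same_technology,
                                raises_different_questions,
                                shown_as_safe_and_effective):
    """Simplified reading of the substantial equivalence criteria listed above."""
    if not same_intended_use:
        # Both branches require the same intended use as the predicate.
        return False
    if same_technology:
        # First branch: same intended use and same technological characteristics.
        return True
    # Second branch: different technology is acceptable only if it raises no
    # different safety/effectiveness questions and the device is shown to be
    # as safe and effective as the legally marketed device.
    return (not raises_different_questions) and shown_as_safe_and_effective

# Hypothetical device: same use, new technology, no new questions, data supplied.
print(is_substantially_equivalent(True, False, False, True))
```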

Once this information has been submitted to the FDA, the FDA will determine whether the device is safe and effective through a thorough review process, including evaluation of performance data and technological characteristics. In the US, a device may not be marketed until the FDA determines that substantial equivalence is present.
References: FDA Premarket Notification 510(k)
Keywords: substantially equivalent, SE, FDA, 510k

	Konica Minolta CM 700d · Brochure · Manual	Courage + Khazaka CL 400 · Brochure	Delfin SkinColorCatch
Device mechanism	Spectrophotometer	Tristimulus colorimetry with Lab color system	Visible spectrum reflectance colorimeter
Dimensions	· Dimensions: 73 x 211.5 x 107mm · Aperture: 3mm or 8mm · Weight: 550g (without cap or batteries)	· Dimensions: 13cm · Aperture: 5mm · Weight: 85g	· Dimensions: 198 x 40 x 35mm · Aperture: ? · Weight: 145g (with batteries)
Illumination	· Light source: pulsed xenon lamp with UV cut filter · Illuminated area: 6mm or 11mm depending on aperture · Angle: 8° viewing angle, 2° or 10° observer angle · Wavelength range: 400-700nm	· Light source: 8 LEDs arranged circularly · Illuminated area: ~17mm · Wavelength range: 440-670nm	· Light source: 3 white LEDs arranged circularly · Illuminated area: 0.3cm^2 · Angle: 45° to minimize gloss · RGB range: 25-246 · Peak wavelengths: 620/540/460nm
Power	· Special AC adapter · 4 AA batteries	· Plug into MPA device	· 2 AA batteries
Measurements	· Eligible for ISO 17025 certification · L*, a*, b* · L*, C*, h · Yxy · XYZ · Munsell (Hue, Value, Chroma) · Color differences	· XYZ · RGB · L*, a*, b* index values (do not fully correspond to ISO) · Calculates ITA	· Melanin index, Erythema index (insensitive to each other) · L*, a*, b* index · Calculates ITA · L*, C*, h · RGB
Reliability	· Zero calibration & White calibration functions · Spectral reflectance: SD <0.1% · Chromaticity value: SD <∆Eab 0.04 · Inter-instrument agreement: <∆Eab 0.2 (for 8mm aperture) · Short-term repeatability: higher with continuous measurements and larger aperture size	· Calibration check function · Measurement error: +/- 5% · Interobserver reliability: ICC 0.79-0.97 moderate/good (Van der Wal et al 2013)	· Calibration check function · Not affected by ambient light due to direct skin contact · Optical orifice designed to minimize pressure induced skin blanching
Data transfer	· Mini USB cable · Bluetooth?	MPA device dependent: · Standalone display · Mini USB cable · Wireless	· Bluetooth cable (wireless receiver unit)
Software	CM-SA Skin Analysis software · Brochure · Manual (Windows PC)	MPA CTplus software · Brochure 1 · Brochure 2 · Includes RHT 400 for ambient temperature and humidity MPA System software (old) · Brochure (Windows PC)	Delfin Modular Core (DMC) software · Includes RoomSensor for ambient temperature and humidity (Windows PC)
Data	· Measurements above · Melanin index · Hb index · Hb SO2 index	· Measurements above	· Measurements above · Tabulated by Sites, Sessions, Subjects, and Instruments · Note: Automatically takes measurements in order of Session instead of Site which is less reliable (requires removing & reapplying probe at each site)
Graphics	· Hue / Value graph · Hb index / Melanin index graph	Yes	Yes
Export to CSV	· Data above · Spectral reflectance (400-700nm) · Does NOT include ITA: ITA = arctan((L* - 50)/b*) × 180/π · Does NOT include RGB: Nix Color Converter	Yes	Yes · But exports each parameter as separate file · Can export data horizontally or vertically
Literature		· C+K	· Delfin
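
Because the table notes that one device's CSV export does not include ITA, here is a minimal Python sketch for deriving it from exported L* and b* values using the standard individual typology angle formula, ITA = arctan((L* - 50)/b*) expressed in degrees. The function name and the example values are illustrative only, and using `atan2` (which also tolerates b* = 0) is a choice made for this sketch; it matches the plain arctangent whenever b* is positive.

```python
import math

def individual_typology_angle(l_star, b_star):
    """ITA in degrees from CIELAB L* (lightness) and b* (yellow-blue) values."""
    # atan2(y, x) equals atan(y / x) for x > 0 and avoids division by zero.
    return math.degrees(math.atan2(l_star - 50.0, b_star))

# Illustrative L*, b* pairs (not measurements from any study).
for l_star, b_star in [(70.0, 20.0), (50.0, 20.0), (30.0, 20.0)]:
    ita = individual_typology_angle(l_star, b_star)
    print(f"L*={l_star}, b*={b_star} -> ITA={ita:.1f} degrees")
```

Higher ITA corresponds to lighter skin and lower (including negative) ITA to darker skin.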

Study	Devices
Wang et al, 2018	CM-700d Spectrophotometer (Minolta) PR-650 Telespectroradiometer (PhotoResearch)
Globale Dermatologie Meng et al, 2020	Chromameter (Minolta) Mexameter (C+K) DermaCatch (Colorix)
Matias et al, 2015	Mexameter MX-18 (C+K) Colorimeter CL-400 (C+K) Antera (Miravex)
Van der Wal et al, 2013	Mexameter MX-18 (C+K) Colorimeter CL-400 (C+K) DSM II ColorMeter
Measurement of Skin by Agache	Chromameter (Minolta) Derma Spectrophotometer (Cortex) Mexameter (C+K) DermaCatch (Colorix)
Baquie & Kasraee, 2014	Mexameter MX-16 (C+K) DermaCatch (Colorix)
Uter et al, 2013	Chromameter CR-300 (Minolta) Reflektometer RM-100
Barel et al, 2001	Visi-Chroma VC-100 Chromameter CR-200 (Minolta)
Kerckhove, 2001	Chromameter CR-300 (Minolta)

Pulse ox validation & certification