No cell aggregates were generated from wild-type splenocytes. No direct comparison was made of the intensities of green fluorescence of cell aggregates with those in oct-gfp embryos or ES cells. I cannot state with certainty whether the green and red fluorescence was autofluorescence. RT-PCR analysis for GFP expression showed significant expression in several aggregates, but not in others that had green fluorescence; these data were very preliminary and thus are not shown. This examination focused on the multipotency of cell aggregates generated by Obokata using a chimeric assay, since this was the central feature of the STAP phenomena. Other data were only preliminary given the time constraints under which these experiments were performed, as described in Discussion.

To make chimeras, cell aggregates were prepared with cag-gfp splenocytes; GFP expression or green fluorescence therefore could not be used as a criterion for selecting cell aggregates. For this reason, they could only be selected by cell cluster morphology. In the present study, the selection was dependent entirely on Obokata’s judgment. If she had succeeded, our plan was next to ask her to describe “cell cluster morphology” precisely.

Many embryos injected with cag-gfp cell aggregates at 8-cell stage and cultured for one day to the blastocyst stage were examined for the presence of green-fluorescent cells, and such cells were found to be present.

Chimeric extent was examined at E9.5 or E8.5 in whole mount. The retracted Nature papers show extensive colonization of the cells (Fig. 4 in the Article and Fig. 1 and Extended Data Fig. 1 in the Letter). The Article reported eight chimeric embryos, showing more than 50% coat color contribution, of 48 chimeras obtained; these animals yielded “STAP”-derived offspring. This was the central finding in the now-retracted STAP reports. However, in the present study, no chimera equivalent to those in Fig. 4 in the Article and Fig. 1 and Extended Data Fig. 1 in the Letter was obtained, nor were any chimeras obtained showing more than 50% contribution to coat pigmentation. Indeed, no chimera showing more than 1% contribution was obtained. I have now revised the text accordingly. We have not examined the limit of detection (minimum number of cells) with the cag-gfp mouse line used, since this was not the point of the present study. However, I believe that if dozens of cells had been present together in any tissue, they would have been detectable in whole mount at E9.5 or E8.5.

Best regards,
Shin Aizawa

Questions from Irene de Lazaro and the responses

Reviewer Report
22 Aug 2016 | for Version 1

Irene de Lazaro, Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, UK

Two 2014 Nature papers authored by Obokata et al. described a novel protocol for the generation of pluripotent cells from somatic cells, based on their exposure to stress-triggering conditions such as mechanical disruption or acid treatment. This process was named stimulus-triggered acquisition of pluripotency (STAP), but the studies were soon retracted on the grounds of several confirmed experimental errors and indications of research misconduct. The present work by Dr. Aizawa describes a systematic study aiming to clarify whether STAP putative pluripotent cells – generated by Obokata under the supervision of RIKEN’s Scientific Validity Examination Team – contribute to the development of the mouse embryo and hence can be considered bona fide pluripotent cells.

The conclusions reached in this study – that the claims for STAP cells’ functional pluripotency are not reproducible – are clearly supported by the data provided, since STAP cell contribution was not observed in any of the recovered embryos. The number of embryos injected with STAP cells to investigate their contribution to different tissues is satisfactorily high. In addition, the design of the study is very systematic and accounts for several potential sources of variability in the chimerism outcome (i.e. source of stress stimulus, technique utilised to cut the cell clusters before microinjection, stage of the embryos at injection and recovery time). Nevertheless, please find below a few suggestions and clarifications that in my opinion could enrich the study by Dr Aizawa.

Experimental design:
In the retracted studies by Obokata et al., CD45+ spleen cells were selected by FACS as the source for STAP cell generation. In the present study, sorting for CD45 was omitted, and the commercial product Lympholyte, which reportedly allows specific isolation of lymphocytes, was used instead. As this is an alteration of the original protocol under investigation, which may introduce differences in the nature of the starting cell population, it would be beneficial if the author could explain the reason behind this change.

It is mentioned that the CAG-GFP transgenic mouse line utilized for chimerism experiments was different to that used by Obokata et al. in the previous, now retracted, studies. Was there a reason to select a different line?

In the original Obokata et al. studies, clusters of STAP cells were injected into E4.5 embryos. However, in the present study the injection was performed into embryos at the E2.5 or E3.5 stage. Was this parameter changed to attempt a higher degree of chimerism? An explanation for this change in the experimental setup would also be desirable for clarity.

Presentation, treatment and discussion of data:

Frequency of cell aggregates from Oct-GFP transgenic spleen after low pH treatment (Table 1): although it is stated in the text that “no apparent difference was found in the frequency of green fluorescence signal under either of the low-pH conditions (HCl or ATP) or genetic background of mice”, this statement would be stronger if it were supported by a statistical test. Did the author conduct statistical analysis on these data?

Green and red fluorescence of cell aggregates: the author seems to imply that the detected signal was a result of autofluorescence, which has in fact been pointed out by other researchers attempting to reproduce the work (Tang et al. 2014; De los Angeles et al. 2015; see last comment below). However, this is not clearly stated in the text. In addition, doubts concerning green autofluorescence could easily be dispelled by using an anti-GFP antibody or by measuring the levels of GFP mRNA by qPCR or GFP protein by Western blot. I strongly encourage the author to perform such studies if samples are still available, since they would definitely clarify the issue. The inclusion in the study of cell aggregates derived from a wild-type mouse line would also have avoided this ambiguity.

Genetic background of mice to generate STAP cells for chimerism studies: It is first stated in the text that the CAG-GFP mice involved in chimerism studies were bred on a C57BL/6 homozygous background. However, it is later highlighted in the text, as well as in Table 2, that both C57BL/6 and F1(C57BL/6x129) were included. Since those were also the backgrounds selected to maintain Oct-GFP transgenics for the cell aggregates assay, this is confusing and needs clarification from the author.

Results included in Niwa, 2016: the author refers several times to the results reported by Niwa (Niwa, 2016), who also investigated the reproducibility of the STAP phenomenon for the Scientific Validity Examination Team in RIKEN. It is, however, not clearly specified whether Niwa’s studies were conducted on the same STAP cells produced by Obokata under strict supervision. Such clarification is important, since qPCR, immunostaining and FACS data are discussed, but not shown, and the reader is directed to Niwa’s work.

At least two other independent studies, carried out at institutions outside RIKEN, have aimed to clarify the STAP controversy (Tang et al. 2014, De los Angeles et al. 2015) and reached similar conclusions to those presented in this article. In particular, the issues on autofluorescence have been extensively scrutinized in De los Angeles et al. I believe that a brief discussion of the observations compiled in such studies would strengthen the present manuscript.

Typographical error: “in the intestests of clarifying the scientific record” (page 2 of 8).
1. Tang MK, Lo LM, Shi WT, Yao Y, et al.: Transient acid treatment cannot induce neonatal somatic cells to become pluripotent stem cells. F1000Res. 2014; 3: 102.
2. De Los Angeles A, Ferrari F, Fujiwara Y, Mathieu R, et al.: Failure to replicate the STAP cell phenomenon. Nature. 2015; 525 (7570): E6-E9.
3. Niwa H: Investigation of the cellular reprogramming phenomenon referred to as stimulus-triggered acquisition of pluripotency (STAP). Sci Rep. 2016; 6: 28003.
Competing Interests
No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Responses (1)
Shinichi Aizawa, RIKEN, Kobe, Japan
Dear Dr. Irene de Lazaro,

I thank you for your comments. The manuscript was revised incorporating your suggestions. My responses are as follows:

There was no FACS cell sorter in the laboratory in which Obokata performed the set of supervised experiments reported here. She had previously obtained “STAP” cells using splenocytes prepared using Lympholyte-M, so we sought to determine whether she was able to repeat this in the present study. If she had succeeded, our plan was next to generate STAP cells using CD45+ cells sorted by FACS.

The origin of the cag-gfp transgenic mouse line used in the retracted Nature papers is unclear, and was not reported in the papers. Dr. Wakayama informed us that he generated the cag-gfp mouse line himself while at the University of Hawaii, but we did not make a formal investigation into this. The mouse line was no longer maintained in the animal facility of CDB and was not available to us. Alternatively, the cag-gfp mouse line may actually have been an Acr/cag-gfp mouse line (Nakanishi et al., Genomics 80, 564-574 (2002)), as suggested in the report by Konno et al. (Konno et al., Nature 525, E4-5 (2015)). However, we only became aware of this possibility at the time of that report, which was after the start of Obokata’s replication attempt. In any case, the cag-gfp mouse line reportedly used in the original STAP reports is different from the cag-gfp mouse line (Okabe et al., 1997) we used in the present study. It is nonetheless difficult to conceive how the difference in cag-gfp transgene might affect the efficiency of “STAP cell” production and chimera generation.

In Fig. 4a of the retracted Nature article, the embryo being injected with “STAP” cells clearly has a zona pellucida. However, E4.5 embryos typically no longer have this structure. In the absence of zona pellucida, injection is practically impossible. We note that E0 is generally defined as 0:00 am of the day when the plug is identified, and suggest that E4.5 may be a typographic error for E3.5. Alternatively, Dr. Wakayama may have artificially delayed the development of the embryo; however, this was not reported in the retracted Nature paper.

We have now included a statistical analysis (t-test), which indicates that the efficiency of cell aggregate formation is significantly different between ATP treatment and HCl treatment in the C57BL/6 background. However, the difference is slight. We have revised the manuscript accordingly (Table 1 and page 5 in the text).

This study focused on the multipotency of cell aggregates generated by Obokata using a chimeric assay as this was the central feature of the reported “STAP” phenomena. Given the time constraints of this study, other data were necessarily limited, as noted in the Discussion. As it was not the focus of the present study, I cannot state definitively that the red fluorescence observed was autofluorescence, although I feel that this is highly likely. RT-PCR analysis for GFP expression showed significant expression in several aggregates, but not in others that showed green fluorescence; however, these data were preliminary at best and are not presented.

The effects of the spleens’ genetic background on both cell aggregate formation and chimeric potency were examined in the C57BL/6 and F1(C57BL/6 x 129) backgrounds. It is well known that ES culture is strongly influenced by genetic background. Both of these backgrounds were used in the retracted Nature papers. I have now revised the manuscript (page 4 and page 6) to clarify this point.

The cell aggregates in Niwa’s report were prepared by Niwa, not by Obokata.

The two reports are now cited and briefly discussed (page 8–9). These works did not examine multipotency by chimeric assay, and the most important issue of the present report is that cell aggregates prepared by Obokata herself did not exhibit multipotency in chimeric assays.

Best regards,
Shin Aizawa