How can we measure and communicate the impact of science?

Prachee Avasthi; Megan L. Hochstrasser; Jasmine Neal; Robert Roth

doi:10.57844/arcadia-74d6-huec

Open question Feedback requested Reimagining scientific publishing

Published on Mar 29, 2024 by Arcadia Science

How can we measure the true impact of science? We're seeking feedback on indicators of the utility and rigor of publications beyond traditional journal metrics. Your input will help shape the future of our publishing experiment.

Purpose

Traditional signals of scientific quality — journal titles, closed peer review, and impact factors — don’t fully reflect the utility and rigor of scientific work. Since our publishing platform exists outside of traditional systems, these signals wouldn’t be available to us or those running other open science initiatives even if they were reliable. There are plenty of other challenges faced by scientists publishing both inside and outside of traditional systems too, including discoverability, tracking reuse, determining ways to re-evaluate quality over time when sharing living documents, and others.

We need new ways to evaluate science that better capture its true value and can be displayed directly on a scientific output so researchers can more easily utilize and expand on it.

The questions we’ve laid out at the bottom of this pub serve as conversation starters to creatively reimagine how we measure scientific efforts, especially forays into open science. We hope this dialogue will inspire us and others to develop open resources and tools that support science sharing for all collaborators in this space. Stay tuned for future publications where we'll share insights from our experiments with different reuse metrics.

Read on for background on what we’ve tried so far, or jump straight to the questions and start a dialogue.

This pub is part of the model creation effort, “Reimagining scientific publishing.” Visit the project narrative for more background and context on our approach to publishing.

Share your thoughts!

Feel free to provide feedback by commenting in the box at the bottom of this page or by posting about this work on social media. Please make all feedback public so other readers can benefit from the discussion.

Motivation

Research is most impactful when it’s findable, accessible, and useful. Thus, a major goal of our publishing experiment is to release rigorous work that we and others can replicate and build upon. This is why we publish our science openly — complete with all the data, code, methods, and other information necessary to reuse and evaluate it.

Since we began iterating on our publishing framework [1], we’ve seen some early signs of success within and beyond Arcadia: community-driven GitHub contributions, reuse of our strains/reagents, alterations to preprints based on our modular reviews, and open feedback beginning to shape the way we think about our science.

Despite that, we are still working to identify all the indicators that will let us understand if we’re meeting the goals of our publishing experiment.

Aims for our publishing model
As described in our “Reimagining scientific publishing” narrative, we’ve identified three key qualities to maximize in our publishing experiment.
Speed: Sharing smaller, more modular pieces of research as we go will let people learn about and use our findings more quickly and will accelerate scientific progress as a whole.
Utility: By breaking from rigid journal formatting, we can maximize usability and explore interactivity. Our data will be easy to find, access, use, and repurpose in ways we can’t predict.
Rigor: We want public comments from anyone. Expertise lives everywhere, not just where you look for it. With diverse feedback and iterative engagement, our work will be improved and we can meet community needs. A key signal of rigor that we’re focusing on is reuse. Are others able to replicate and build upon the work we release?

What do we measure so far?

Strong metrics can inform our internal strategy and, when shared publicly, provide the people encountering our work with a means to quickly and effectively evaluate its usefulness. While we don’t yet communicate any of this data to readers, we currently gather and analyze a variety of quantitative metrics, including:

Metrics about individual pubs

PubPub:
- Pageviews
- Unique visitors
- Country of visitors
- PDF downloads
- Number of public comments
- Traffic sources
Citations (via Google Scholar)

Metrics about linked resources

Protocols.io metrics:
- Views
- Runs
- Exports
- Comments
GitHub metrics:
- Unique visitors
- Unique clones
- Number of pull requests (forthcoming)
- Number of issues (forthcoming)
Zenodo metrics:
- Views
- Downloads

We also gather qualitative metrics that could indicate utility and rigor, such as responses to the survey that you'll find at the bottom of every pub and public comments on our platform.

Tracking this data is helpful for researchers to determine who their work reaches, its quality, and how it’s used. Still, it doesn’t help readers understand if the work is rigorous or useful to them. We’re developing ways to display metrics on our publications that reflect utility and rigor. But we’re still figuring out the best form for that to take. If you have thoughts on what would be useful for you to see, please leave a comment here or on question number one!

What else do we want to measure?

While useful, many of the metrics above simply indicate reach (e.g. pageviews) or move at a pace that doesn’t match ours (e.g. citations). Reach can be a useful marketing metric, but it doesn’t reveal much about our science or its impact on its own. We need new ways to assess the utility of our work, ensure the feedback loop is fast enough to improve it, show scientific value to readers so they can quickly assess if a pub will be useful to them, and indicate how public feedback influenced our science.

What could we measure that would be more informative, and how would we collect that data efficiently? What parts of a pub is a given researcher using (code, protocols, data, etc.), and are they usable? How can we tell if our tools directly or indirectly inspire future work?

Many organizations and individuals are innovating in this realm; we aren’t alone in this struggle. PLOS developed a set of “Open Science Indicators” to better understand the uptake of open science practices throughout the scientific ecosystem [2]. Recognizing the limitations of journal metrics, researchers in various fields have also proposed alternative frameworks. For example, the “Scientific Impact Framework” seeks to evaluate the influence of a piece of research using quantitative and qualitative metrics across multiple domains, from dissemination to implementation in public health policy [3]. And, with the rapidly expanding role of social media in facilitating scientific discussion, a variety of groups are working to gain new insights into who specific outputs are reaching and the dialogue surrounding them [4].

How might we continue to innovate together, share resources to document these efforts, and evaluate their outcomes?

Our goal is not to create a different impact factor — we recognize that scientific value cannot be boiled down to a single number and believe it should be conveyed through an array of different indicators. With rapid advances in AI and language processing, we as a science community are well-positioned to build nuanced, useful, and easy-to-parse methods to measure this.

Let’s have a public conversation about how to identify and communicate qualitative and quantitative signs of rigor, utility, and reuse. We hope this forum will spark ideas for us and others to develop open tools or projects that will make it easier to evaluate scientific impact.

Weigh in!

While we’d love any thoughts or feedback you have, we’ve decided to focus on a small set of specific questions to provoke discussion:

In the absence of editorial decisions, what data, tags, summaries, or other information would help you quickly determine if a piece of research is relevant to your interests and use cases?
What existing or novel measures could indicate that research…
- is verifiable (i.e., can someone verify that the work is rigorous and replicable)?
- has been verified?
- has been expanded or built on?
How might we effectively track the ways a given piece of research is reused (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are there existing tools that do this well?
What shared benchmarks should the open science community consider to evaluate the success of different publishing models?

If you like the idea of providing open feedback, consider weighing in on the questions above and signing up for our pub digest to get notified when we release new work! Remember, you don’t need to write an entire review — we encourage in-line, modular feedback. Even a quick comment is appreciated!

How can I join the discussion?
We hope you’ll respond publicly to our questions above by selecting/highlighting the question you’d like to answer, clicking the comment icon, and typing in your thoughts (as shown in the GIF below)! You’ll need a PubPub account to do this, but it’s free and quick to make one. Here’s a quick tutorial on how to comment.

Watch our follow-up discussion

On May 29, 2024, we held a live, interactive discussion with ASAPbio to discuss the topics in this pub. Some comments from the discussion have been posted in the “Weigh in!” section, and you can view the entire recording below. We’re still looking for feedback — feel free to add your own thoughts based on our discussion here!

Methods

We used ChatGPT to provide feedback on draft text and to suggest wording ideas and then used its responses as inspiration to improve the draft without directly using any of its phrasings.

Prachee Avasthi

Critical Feedback

Megan L. Hochstrasser

Editing, Supervision

Jasmine Neal

Writing

Robert Roth

Conceptualization, Writing

VR on Oct 16, 2025

You are correct in saying that "Traditional signals of scientific quality — journal titles, closed peer review, and impact factors — don’t fully reflect the utility and rigor of scientific work". However, your publishing model changes this from "not fully" to "not at all, not even a little bit".

The traditional publishing model seeks to at least partially validate the quality and rigor of scientific output using the publisher's, editors', and peer reviewers' reputations, particularly in open peer review, as collateral. The traditional model also has a comparatively robust way of dealing with honest errors and demonstrable fraud, as erroneous conclusions and imperfect data remain visible as part of public record, as do any corrections, expressions of concern, retraction notices.

When I cite a Current Biology paper (pick your own favorite journal), I know that the final product is the result of multiple rounds of vetting, at a minimum, at the grant agency, institutional, publishing house, and individual peer reviewer levels. When I cite what effectively is a blog post, I am fully and personally responsible for the validation of the entire data stream, including primary data and code. This requires expertise in the scientific domain in question, at least a working understanding of the methodology, and potentially access to complex software tools to, for example, detect image manipulation or plagiarism. And what do I do if the blog post disappears, potentially together with the author, whose name or credentials I may not be able to independently verify, and perhaps with the entire publication platform? All of a sudden, my own research or conclusions are no longer based on any bit of existing knowledge, but rather on a mere memory of evidence of such knowledge. This gets bad quickly; now I need to redo the experiments I thought were part of the public record.

Does the traditional model have a visibility problem? I don't think so. PubMed still works comparatively well, and so do search engines. Does the traditional model have a quality control problem? It can slip, but I do not think there is evidence of a systemic failure. Does the traditional model have a reuse tracking problem? Not really. Citation Reports, G Scholar, and other similar tools work reasonably well. Does the traditional model have an access problem? Yes, absolutely! Paywalled publications still represent the bulk of scientific output.

Your motivation is clear, but your model has demonstrably lower visibility, quality control, and reuse tracking characteristics. It does move the accessibility needle in the right direction, although so do true open access publications and preprint archives. Without the rigorous controls characteristic of the traditional model, a blog post is just that, a blog post. It may be written by a scientist, and data may be legitimate, yet still, just a blog post. There is nothing intrinsically bad about it; after all, "Metamorphosis Insectorum Surinamensium" was self-published, i.e. effectively a blog post. But then again, so was "Natural Cures "They" Don't Want You to Know About".

Prachee Avasthi on Oct 18, 2025

The idea that one should read a Current Biology paper and a preprint with different levels of critical thought is deeply concerning. No degree of journal peer review can, or should, absolve readers of their responsibility to think critically. It is precisely this permission to abdicate judgment that harms the scientific enterprise.

The notion that a binary decision about quality, based on the private opinions of a few reviewers, represents certainty is a cozy fiction. Reviewers rarely analyze raw data themselves, and most journals don’t even require all underlying artifacts to be shared. Being unaware of the level of scrutiny applied, because it is deliberately obscured, does not make a paper more trustworthy.

Labeling something a “blog post” as a self-evident pejorative misses a key point: the enormous costs of maintaining the journal status quo in exchange for the illusion of quality control. In my view, it is far more damaging to anchor the trajectory of science to “blog posts” with a gold star from a few anonymous reviewers, declared as truth only after great delay, expense, and restricted visibility behind paywalls.

Ultimately, science should be measured by whether it withstands further evaluation and reuse, precisely because neither journal reviewers nor readers conduct such testing upon first read. Any model that claims to serve science must enable, not obstruct, this iterative process of collective scrutiny and refinement.

Sierra on Aug 30, 2025

The proposed reuse-centered metrics are a welcome shift away from static, prestige-based indicators. My question centers on implementation: would this require standardized, cross-platform infrastructure to track reuse events (e.g., GitHub, Zenodo, protocols.io)? Relatedly, how might we separate passive reach (downloads, pageviews) from active integration (successful replication, incorporation into new studies)? An automated, DOI-anchored pipeline that links a pub to its downstream artifacts (papers, datasets, tools, code) could make those signals more robust and auditable—and, ideally, tag the type of reuse so readers can quickly see what was replicated, adapted, or extended.
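
As a rough illustration of the DOI-anchored idea above, the sketch below shows one way a reuse record could be structured so that each downstream artifact is linked to its source pub and tagged by type of reuse. Everything here is hypothetical: the `ReuseEvent` fields, the `ReuseType` categories (taken from the "replicated, adapted, or extended" wording), and the placeholder DOIs are assumptions for illustration, not an existing schema or tool.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ReuseType(Enum):
    """Hypothetical reuse categories, taken from the comment's wording."""
    REPLICATED = "replicated"
    ADAPTED = "adapted"
    EXTENDED = "extended"


@dataclass(frozen=True)
class ReuseEvent:
    """One link from a pub to a downstream artifact (assumed fields)."""
    source_doi: str     # DOI of the original pub
    artifact_doi: str   # DOI of the downstream paper, dataset, tool, or code
    artifact_kind: str  # e.g. "paper", "dataset", "tool", "code"
    reuse_type: ReuseType


def summarize_reuse(events, source_doi):
    """Count the reuse events recorded for one pub, grouped by reuse type."""
    counts = Counter(e.reuse_type for e in events if e.source_doi == source_doi)
    return {t.value: counts.get(t, 0) for t in ReuseType}


# Placeholder DOIs, invented purely for this example.
events = [
    ReuseEvent("10.0000/example-pub", "10.0000/paper-a", "paper", ReuseType.EXTENDED),
    ReuseEvent("10.0000/example-pub", "10.0000/code-b", "code", ReuseType.EXTENDED),
    ReuseEvent("10.0000/example-pub", "10.0000/dataset-c", "dataset", ReuseType.REPLICATED),
    ReuseEvent("10.0000/other-pub", "10.0000/tool-d", "tool", ReuseType.ADAPTED),
]

summary = summarize_reuse(events, "10.0000/example-pub")
print(summary)
```

Keeping the reuse type as an explicit field, rather than inferring it from pageviews or downloads, is one way to separate active integration from passive reach. How such events would actually be collected is left open here.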

Robert Roth on Sep 02, 2025

Thank you for your comment, Sierra! These are great questions, and I don't think there's a perfect answer yet. Standardized, cross-platform infrastructure would be great, but I don't know that it's particularly realistic given the number of platforms and avenues that now exist for researchers to share their science. I think we're in a good position to leverage AI tools, provided there is a decent level of machine readability on sites that scientists most often use (and assuming that these sites allow scraping). And I agree that DOIs are probably the best way to track and map these research outputs — but there is still more work to be done to encourage more robust metadata in DOI deposits, making sure that landing pages are truly persistent, etc.

Pavel Chekmaryov on May 22, 2025

sounds great, I'd go for a tool that highlights the currently active technologies that use/derive from the given paper, maybe even graph of dependencies in a timelapse

Robert Roth on May 22, 2025

Love this idea! I'd want to think more about methods to identify those technologies (and monitor them to make sure that it remains current) but I think this could be a very useful output.

Stephen Goldstein on Feb 17, 2025

Selection

gather and analyze a variety of quantitative metrics, including: Metrics about individual pubs: PubPub: Pageviews, Unique visitors, Country of visitors, PDF downloads, Number of public comment

I would add to this a metric for how often people click the doi to copy the link. When I find something interesting, if I’m moving quickly, I’ll copy DOIs and paste them into paperpile to download and return to. Not sure these metrics would capture that.

Robert Roth on Feb 18, 2025

This is a really great idea! Thank you for adding this insight.

Edi Wipf on Mar 11, 2025

Selection

y assess if a pub will be useful to them, and indicate how public feedback influenced our science. What could we measure that would be more informative, and how would we collect that data efficiently? What parts of a pub is a given researcher using (code, protocols, data, etc.), and are they usable?

When considering impact, I often think about how research and discovery seeds future projects, particularly through what gets funded. It would be fascinating to explore how grant funding patterns could be tied to publications, shedding light on their influence on subsequent funded research.

Tracking how many citations and shared keywords from Arcadia’s research appear in successful grant applications could provide valuable insights into its influence on funding trends. Publicly available funding data could be leveraged to draw connections between your work and funded projects across different fields. For this aim as well, incorporation of AI tools to scan and tag publications that cite or adapt your research could help uncover unexpected intersections of reuse, highlighting the multidimensional impact of your outputs.

Another thought is the idea of creating an interactive map that visualizes the lineage of research outputs. This map could trace how work is adopted and expanded upon—not just in grant-funded projects but also in broader contexts such as policy changes, industry innovations, or public health interventions.

These ideas are tied to providing a more comprehensive picture of how research contributes to tangible progress across domains. Moreover, making these insights easily accessible to stakeholders could inspire greater trust and engagement with open science models.

Robert Roth on Mar 11, 2025

These are excellent ideas, and I agree they're strong indicators of impact. Using NLP techniques to explore impact in a more content-based way is something we've been really interested in trying as we publish a greater number of pubs and people reuse our work more often. Are there any particularly compelling approaches you've found? We’ve been reading a lot on this topic, and I’m curious if you (or anyone else reading this) have seen any standout approaches or tried to do this on your own work.

Daniela Liebsch on Apr 01, 2024

Selection

pecific questions to provoke discussion: In the absence of editorial decisions, what data, tags, summaries, or other information would help you quickly determine if a piece of research is relevant to your i

When it comes to relevance, for me, I’d simply say it is topic-based, so a very clear, honest summary, keywords, and limitations would be most helpful. Traditional abstracts and summaries are often a bit vague, advertising rather than stating limitations. For tools, something that outlines possible uses, cites how the tool was used (maybe some kind of summary of uses), and sets it apart from other similar tools, or even a quick comparison with other tools and, again, a specific strength/limitation summary, could help.

Robert Roth on Apr 03, 2024

Thank you for your thoughts on this, Daniela! I’m curious — do you find yourself using filtering tools (such as by topic) to list and then find articles, or do you generally do more of a keyword-based search to find specific tools/publications that could be useful? I’m guessing this depends on why you’re looking for publications, but I’d be interested to hear your thoughts.

Daniela Liebsch on Apr 14, 2024

You are right, it depends. Normally, if I look for something specific (rather than just browsing), I’d search for keywords, methods, a key aspect of what I want to know, or a phrase. (The topic for plant people is often just plant biology, haha T.T, so I use that for filtering, but it doesn’t narrow things down much.) Sometimes I just use Google or search images; e.g., if looking for a phenotype, images are really helpful. A cool tool is sometimes Textpresso, which shows a result in the context of the surrounding text. So some advanced, free, keyword-based science search engines combining such features would be cool, along with publications, preprints, or any science output being in a format that is accessible to such tools. Otherwise, when just browsing, it’s much harder to come across something that is interesting but that you are not looking for, especially if it is on less common platforms and in different fields.

Daven Northroup-Kuder on Apr 06, 2024

Selection

plying the knowledge provided, using a tool, etc.)? Are there existing tools that do this well? What shared benchmarks should the open science community consider to evaluate the success of different publishing models? If you like the idea of providing open feedback, consider weighing in on the questions above and

I think it would be really helpful to communicate the scope and impact of publications. This could help policymakers and the general public understand the focus and utility of different publications. These are some possible impact and scope questions with associated scale:

‘What is the scope of study?’ [peer reviewers would link related topics covered in this paper (e.g., electrochemistry, protein engineering, etc.)]

‘What is the scale of impact for this paper?’ [a 1-5 rating scale from niche to universal]

‘How accessible is this paper?’ [a 1-5 scale from very niche to easily understood]

In addition, a publication’s impact on various sectors (such as policy, technology, education, medicine, etc) could be assessed.

These benchmarks could be determined by a weighted mixed voting system (similar to Rotten Tomatoes’ rating system). Peer reviewers and approved readers would give each publication an initial score, and then every reader would have the chance to score the paper's impact and scope. The scores of approved users and peer reviewers would carry more weight than those of general readers, but general readers would still get to report on how they found the article.

Assessing a paper's scope and impact via weighted crowdsourcing would help assess the subjective response to publications. This impact and scope rating system could easily be added to the existing peer-review process and journal/publication platforms.
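
One minimal way to sketch the weighted voting described above is a weighted average of 1-5 ratings, where each rater group carries a different weight. The group names and the weights (3, 2, and 1) are assumptions chosen only for illustration; the comment does not specify them.

```python
def weighted_score(ratings, weights):
    """Return the weighted mean of 1-5 ratings.

    ratings: list of (rater_group, score) pairs
    weights: dict mapping each rater_group to its voting weight
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    for group, score in ratings:
        if group not in weights:
            raise ValueError(f"unknown rater group: {group}")
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
    total_weight = sum(weights[group] for group, _ in ratings)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[group] * score for group, score in ratings)
    return weighted_sum / total_weight


# Hypothetical weights: reviewers count most, general readers least.
weights = {"peer_reviewer": 3.0, "approved_reader": 2.0, "general_reader": 1.0}

# Hypothetical ratings for one publication's "scale of impact" question.
ratings = [
    ("peer_reviewer", 4),
    ("approved_reader", 5),
    ("general_reader", 2),
    ("general_reader", 3),
]

score = weighted_score(ratings, weights)
print(round(score, 2))  # 27 / 7, about 3.86
```

With these example weights, one peer reviewer's rating counts as much as three general readers' ratings. Choosing the weights, and deciding who qualifies as an approved reader, would be the substantive design decisions.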

Jasmine Neal on Apr 11, 2024

Thanks for your feedback, Daven! We definitely agree that it would be helpful to communicate the scope and impact to the reader, and this is something we’re actively thinking about! For our other pubs, we currently ask the reader about clarity, utility, replicability, and rigor, but I wonder if we should consider expanding or modifying these questions to help measure impact or scope? For example, we could also ask the reader which sector this work would be useful for, if they indicate that it’s useful. We’d love to make these (or other) questions more prominent and to display results as the PubPub platform evolves to allow us more control.

I’m curious – would you also find it helpful to see how other scientists or labs are using the work or would the results of a weighted-mixed voting system be more helpful for you? Or perhaps a combination of both would be better?

L. Robert Hollingsworth on Jun 13, 2024

A system similar to what you propose is being tested by SolvingFor (https://solvingfor.org/dsdc-pilot). While I understand the lack of job inventory in academia necessitates some sort of triage, I think that layering impact metrics into science is fundamentally flawed. In the moment, we can assess timeliness of a discovery in relation to immediate follow-up work or the ability to steer the field towards a more correct and away from a less correct hypothesis. However, the true value of an immensely novel discovery might not be apparent for much time. Also, using voting-based metrics can lead to scientific populism and bias towards certain personalities. Metrics in general inevitably lead to behaviors that game the metrics, which could lead to all of the problems of the current journal system. I ask—is there a way to dispense with any sort of aggregated metric system, and instead evaluate science based on its relation to the field and your work? Is there a way to separate contribution from impact, as to discourage task master-based promotion into sustainable long-term scientific positions?

Also—here:

https://research.arcadiascience.com/pub/open-question-measuring-reuse#nr4104hk7ex

Pavithran Narayanan on Apr 07, 2024

Selection

at existing or novel measures could indicate that research is or isn’t rigorous and replicable? How might we effectively track the reuse of a given piece of research (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are there existing tools that do this well? What shared benchmarks should the open science community consider to evaluate the success of di

Before attempting to answer any of these questions, I think it is necessary to look at (and probably measure, if possible) the reach and penetration of Arcadia Science in the research community. Only if a significant fraction (used here as a loose term) of researchers is aware of the company and its work will it make sense to measure impact. Please note that this reach is different from the reach mentioned in this pub, which is a measure of the views, downloads, comments, etc.

Active Outreach:

The reach of the company could, in many ways, depend on the kind of outreach strategies the company employs. I think what the company currently needs is what I call an “active outreach” strategy. This essentially involves directly reaching out to researchers in a given field through email, in person at conferences, or other relevant means. For example, if the company works on a project on Actin structure, the company needs to actively reach out to the researchers (PIs, postdocs) working in the same field and let them know about the work that it has recently published. This would allow them to know about the work and engage with it as per their preference.

After an initial round of active outreach in a given field, this circle would be expected to grow, and the company's work could be expected to penetrate the community with time. This would potentially help the pubs receive more visibility and feedback, and probably get cited in other publications. The company would then have something on the slate to start with, upon which it can build useful metrics and approaches to track utility, rigour, and reuse.

Such an active approach may ultimately pave the way either for other researchers to directly engage with the research group at Arcadia Science (so that reuse could be directly tracked through personal communication and ultimately as citations in published works) or for the company to develop much more robust measures for the reuse of its work.

Jasmine Neal on Apr 08, 2024

Thanks for your feedback Pavithran, you bring up really great points! I’m actively thinking about this aspect of our publishing experiment at Arcadia so I wanted to respond to your comment with some more context and questions. 🤗

Active Outreach

We definitely agree that an active outreach strategy is necessary and I’m glad to say that Arcadia does execute active outreach via social media and email, at conferences, by co-hosting virtual and IRL events and more. We also track our interactions with people in our community to monitor its growth and better understand how people interact with us. Reach is a bit easier to measure for conferences since we can easily compare the number of scientists who joined our newsletter/community with total conference attendance. But how do we measure our reach more broadly? What number can we use to get a sense of the scientific community at large? (I don’t have an answer for this yet but I welcome all suggestions!)

As you suggested, monitoring active outreach and the growth of our community does give us a glimpse into reuse and we can even anticipate citations based on how other researchers respond to our outreach, ask questions etc. However, much of the valuable discussion remains behind closed doors if done via email or buried in threads on X. Oftentimes, a question or suggestion that one researcher may have is shared by another researcher.

One difficulty, especially pertaining to email outreach, is “converting” any given feedback into a comment so that the entire community can benefit from it. At times there is a reluctance to post feedback publicly even when asked. Is this because making a PubPub account is a big lift? Or because there is hesitation with publicly criticizing someone else’s work, even if it’s welcomed and done constructively? Or perhaps comments aren’t the best way to make feedback visible to others? (I’d love to hear your perspective on any of these questions!)

I have many more thoughts and ideas but perhaps this merits an open question pub specifically about engagement…? 🙃

Jasmine Neal on Apr 16, 2024

Selection

ures could indicate that research is or isn’t rigorous and replicable? How might we effectively track the reuse of a given piece of research (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are

It may be interesting to display who is using the work and how it is being used in other contexts. Similar to the “works for me” button found on protocols.io, researchers could indicate if the tool or result is “in use” and perhaps elaborate on their use case. This could lead to a dropdown list or other feature that shows all the places the work is currently in use and what it is being used for. Listing the examples of reuse could also drive traffic to the work of other scientists in the community, so it may incentivize others to indicate that they’re using a tool, result, etc. Just an idea! :)

Micah Johnson on Jan 13, 2025

To track how a piece of research is reused, I find it particularly helpful to see which other papers have cited it. I also noticed you mentioned having access to data on the types of researchers viewing different publications; making those metrics visible would be a fantastic addition. Even simple bar graphs or a dedicated section showing viewer backgrounds would be beneficial. Since you’ve already implemented open sharing (which I really appreciate), why not also highlight which posts receive the most engagement?

Hui Xin Ng on May 23, 2024

Selection

Before we can evaluate success, we need a clear definition of "success".
However, I believe that there is no single definition of success when it comes to measuring the impact of science, because that definition evolves depending on the desired outcome. The desired outcomes differ depending on the role of the person or the organization.

For instance, a government official tasked with creating a new public health program might want to learn about the outcomes across multiple interventions from an existing case study, or potential outcomes from a health economics study. Downloads and views are not sufficient to reflect the impact of a publication - because what follows from reading the publication might be more nebulous - e.g., the government official then contacts the lead authors of the study for engagement/consultation. The publication may facilitate the initial point of contact, but the impact goes beyond a single metric we can measure - I imagine one of the ways to measure such impact (if we narrow the scope to comparing publishing models) is to ask how effectively X publishing platform/model itself resonates with non-scientists? If we are only measuring metrics like downloads, it implies a single direction of information flow. A successful model, I reckon, will be bidirectional, and we need to find ways to perhaps quickly prototype a model for publishing/sharing science and see how it resonates with people who are not directly involved in scientific knowledge production.

That may be beyond the scope of the current discussion. Going back to measuring success across publishing models, I think we need to consider the ease of adoption of a new publication/research-sharing model. This might be a chicken and egg problem where we don't have enough people trying out a new model and hence we can't evaluate it. But I suspect there might be subgroups of people who face similar obstacles to adopting a new publishing model.

My thoughts on what to measure now are quite nebulous and I will expand them in the coming weeks. But to close it off, for now, here's one question I'd like to pose for discussion: Who uses science/research findings, and for what purpose? Answering this question will help us identify different metrics to measure other than metrics like number of downloads.

Melissa Steele-Ogus on May 22, 2024

Selection

ack you have, we’ve decided to focus on a small set of specific questions to provoke discussion:In the absence of editorial decisions, what data, tags, summaries, or other information would help you quickly determine if a piece of research is relevant to your interests and use cases?What existing or novel measures could indicate that research is or isn’t rigorous and replicabl

Graphical abstracts are always really valuable to me–I can quickly scan to see if the subject and findings of the paper are something that applies to my interests.

Melissa Steele-Ogus on May 22, 2024

Selection

help you quickly determine if a piece of research is relevant to your interests and use cases?What existing or novel measures could indicate that research is or isn’t rigorous and replicable?How might we effectively track the reuse of a given piece of research (i.e., others following u

It seems like this is already being implemented here, but having a comments or discussion section where people can compare notes seems really key, especially for those of us who work in really niche fields. Negative results may not be publishable, but knowing about them may save your colleagues or future researchers a ton of work.

Anna Hatch on May 29, 2024

Selection

a, tags, summaries, or other information would help you quickly determine if a piece of research is relevant to your interests and use cases?What existing or novel measures could indicate that research…is verifiable (i.e., can some

I look at figures and methods to see if results match conclusions.

Shaurya Chanana on Sep 02, 2024

At the surface level, reading the abstract _should_ suffice but it often doesn’t because abstracts are usually very technical. Alternatively, reading the results and skimming the discussion helps, but that’s a lot of reading and no one has time for that.

One way could be to have an AI-based summary of the article. The downside is that the summary could be partially hallucinated and probably unreliable. We could add some conditions like making sure words in the summary appear in the paper etc.
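The word-overlap condition mentioned above could be sketched roughly as follows. This is only an illustration, not an existing tool: the function names, the `min_length` cutoff, and the example texts are all assumptions, and a check like this only flags vocabulary that never appears in the paper. It cannot confirm that the summary's claims are actually correct.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def unsupported_words(summary, paper, min_length=6):
    """Return summary words (at least min_length letters) that never occur in the paper.

    min_length is a hypothetical cutoff used to skip short, common words.
    """
    paper_vocab = set(tokenize(paper))
    missing = []
    for word in tokenize(summary):
        if len(word) >= min_length and word not in paper_vocab and word not in missing:
            missing.append(word)
    return missing


# Made-up example texts, for illustration only.
paper = "We present a clustering algorithm that separates outliers from dense regions."
summary = "A clustering algorithm that separates outliers using neural networks."

print(unsupported_words(summary, paper))  # ['neural', 'networks']
```

A summary that returns a non-empty list could be flagged for a human to check before it is displayed.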

Another way is to write a “narrative” at the beginning of the paper. So, if the paper is about a novel clustering algorithm that claims to make it easier to see unrelated points in a space, the paper-writer could try writing a story around what kinds of problems a reader could use this method for.

A third way is to force paper-writers to simplify their language and add a graphical abstract.

Daniela Saderi on May 29, 2024

Selection

ur interests and use cases?What existing or novel measures could indicate that research…is verifiable (i.e., can someone verify that the work is rigorous and replicable)?has been verified?h

Methods section… is the protocol detailed enough to suggest how to replicate an experiment?

Robert Roth on Jun 06, 2024

Definitely something to look for! I’m curious if you’ve found any sort of indications of this to be helpful (like the ‘Works for me’ button on protocols.io, for instance) or any kind of public commenting (on Twitter, bioRxiv, etc.) that has indicated it’s detailed enough? Or does it generally require more of a personal, manual review for you to feel confident in its level of detail?

Luis Goicouria on May 31, 2024

Selection

I love commentaries because they often effectively communicate technically niche or complex data to a broader audience, provide more unbiased discussion of the significance of the findings, and occasionally provide criticism of the design and implementation of the study. I would imagine that finding a third party, uninvolved in the production of the publication in question, to write a commentary would provide a valuable tool in interpreting the use cases and limitations of the findings and determining how relevant the findings are to my interests.

Luis Goicouria on May 31, 2024

Selection

esearch…is verifiable (i.e., can someone verify that the work is rigorous and replicable)?has been verified?has been expanded or built on?How might we effectively track the ways a given piece of research is reused (i.e., others foll

Generally, the tracking of citations of your publication is used to determine the extent to which your work is being verified or built on. This, however, isn’t sufficiently comprehensive (as not all pertinent research may find your publication, especially if there are barriers to finding or accessing your work). I would imagine that a supplemental measure would be to track publications that cite the same publications that you cited in your publication. Other publications that use a certain threshold of shared citations are more likely to be pertinent to your research, even in the event that they are not citing your work specifically.
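As a rough illustration of the shared-citation idea, the sketch below counts how many references each candidate publication has in common with a target publication and keeps those at or above a threshold. The function name, the identifiers (`ref1`, `pubA`, and so on), and the threshold value are all hypothetical; a real version would need reference lists pulled from some citation database.

```python
def coupled_publications(target_refs, candidates, threshold=2):
    """Return (publication_id, shared_count) pairs sharing at least `threshold` references.

    target_refs: reference identifiers cited by the publication of interest.
    candidates: dict mapping a publication id to the list of references it cites.
    Results are sorted by shared count (highest first), then by publication id.
    """
    target = set(target_refs)
    results = []
    for pub_id, refs in candidates.items():
        shared = len(target & set(refs))
        if shared >= threshold:
            results.append((pub_id, shared))
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Made-up reference lists, for illustration only.
my_refs = ["ref1", "ref2", "ref3", "ref4"]
others = {
    "pubA": ["ref1", "ref2", "ref3", "ref9"],
    "pubB": ["ref4", "ref8"],
    "pubC": ["ref2", "ref3"],
}

print(coupled_publications(my_refs, others))  # [('pubA', 3), ('pubC', 2)]
```

Publications that pass the threshold without citing the target work directly would be the ones worth surfacing as potentially pertinent.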

Robert Roth on Jun 06, 2024

Thank you for contributing to the discussion and for your note on barriers to finding or accessing the work — discoverability/accessibility is a key part of this puzzle, especially outside of more traditional channels.

I’m curious if you or anyone who sees this comment has tried out any tools or seen other projects that try to get at this concept and what you thought of them/if you found it helpful. I suppose this is similar to the concept of bibliographic coupling? Correct me if I’m wrong.

It also makes me wonder what else we could add to that analysis to more easily get at how pertinent the shared citations might be to the original publication (and maybe even uncover some ‘hidden’ evidence of reuse/influence of the original publication). Could be an interesting use case for LLMs, or combining some of the work being done on contextual citations.

James Boyd on Jun 01, 2024

Selection

t of specific questions to provoke discussion:In the absence of editorial decisions, what data, tags, summaries, or other information would help you quickly determine if a piece of research is relevan

As a general issue in scientific publishing, I believe that literature search friction and duration can be improved if publications are tagged both by subject and by common, predictable use cases. I’m often looking for particular perspectives/angles on a given subject, and often have to infer the kind of perspective/angle that a paper provides only after reading it in some depth. Here are some examples of what I might personally seek during a literature search:

New results/data that corroborate/support a given theory or investigation
New theories/models with better explanatory power
New results/data that challenge a given theory or investigation
Strategic and/or historical commentary
Summaries and expository work
Reproduced/replicated studies
Methodological innovations
Reviews or criticism

I was happy to see the “Negative Data” and “Open Question” tags on Arcadia publications; they’re a step in the right direction. I personally favor more tags, though I imagine that avoidance of over-tagging is an important curation issue.

When I see “Negative Data”, I immediately wonder – what is negated? Is it a hypothesis previously formed within Arcadia? Is it a major assumption that the entire community holds? (That is, what is its “metascientific scope”?) Does the negative result present a quandary, or is a new explanation offered instead? (What is the “deliverable scope”?) In summary, it would be helpful to know, when browsing, what kind of scientific proposition the data is negating, and what I’ll stand to gain by reading the publication (e.g. a new explanation, a scientific dilemma to mull over, a study replication prospect, a new methodology to consider, etc.)

I recognize that tags won’t be able to capture much of the above information, but I can imagine 1-3 tags (e.g. subject, use-case, scope) being quite informative.

Shaurya Chanana on Sep 02, 2024

I think having an NLP-generated large repository of technical nouns and word-phrases would be a great starting point. Some kind of word-cloud-like structure that is weighted by field-relevance and how frequently they occur in various topics would be helpful. Then, for any new paper, it could be auto-tagged based on this large corpus of tags.
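A very small sketch of the auto-tagging step might look like the following. It assumes the weighted term repository already exists as a dictionary. The tags, terms, and weights here are invented for illustration, and only single-word terms are handled. Building the repository itself (the NLP part) is not shown.

```python
import re
from collections import Counter


def suggest_tags(text, term_weights, max_tags=3):
    """Score each tag by summing weight * occurrences of its terms in the text.

    term_weights: dict mapping a tag to a dict of {single-word term: weight}.
    Returns up to max_tags tags with a positive score, best first.
    """
    word_counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {}
    for tag, terms in term_weights.items():
        score = sum(weight * word_counts[term] for term, weight in terms.items())
        if score > 0:
            scores[tag] = score
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ranked[:max_tags]


# Hypothetical term repository; weights stand in for field relevance.
term_weights = {
    "genomics": {"genome": 1.0, "sequencing": 0.8},
    "microscopy": {"confocal": 1.0, "imaging": 0.6},
    "proteomics": {"peptide": 1.0},
}
text = "We used sequencing to assemble the genome and confirmed localization by imaging."

print(suggest_tags(text, term_weights))  # ['genomics', 'microscopy']
```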

James Boyd on Jun 01, 2024

Selection

g tools that do this well?What shared benchmarks should the open science community consider to evaluate the success of different publishing models?If you like the idea of providing open feedback, consider weighi

Well, I think the best solution to this problem is a somewhat “longer-term” prospect, but I’ll raise it here nonetheless. I think the success of a publishing model pertains to its ability to directly facilitate advancement of science itself, as opposed to, say, “network-based” indicators that only approximate social adoption/discussion/sharing. Fundamentally, scientific theories/hypotheses/models are “algorithms” (defined as loosely as you like) that eat data and produce explanations/predictions, and any publishing model that helps gather better input and/or deliver better output should garner prestige for doing so.

As a highly simplified toy model, consider “research” that is quite directly related to algorithms/data, such as quantitative finance: CrunchDAO (which is a kind of decentralized hedge fund…) actually has a leaderboard in place that ranks participants by the performance of their models. Of course, theories and models in many scientific fields are often not actual algorithms, and assessing their ability to best explain data would require a qualitative (though, nevertheless, standardizable) metascientific criterion. Fortunately for humankind, the realm of biology is more sophisticated than that of hedge funds :D And metascience will require criteria more sophisticated than “market returns”. Nevertheless, I wonder to what extent publishing platforms could ultimately be “metascientifically ranked” by their ability to gather the best data and develop the best explanations.

James Boyd on Jun 01, 2024

(A “better explanation” being one that is simpler, has fewer exceptions, covers more cases, enjoys better consistency with other explanations, etc.) To put it simply: metrics are a poor proxy for what could eventually be a standardized metascientific evaluation of work.

Claire Duvallet on Jun 03, 2024

Selection

licable)?has been verified?has been expanded or built on?How might we effectively track the ways a given piece of research is reused (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are

It might be interesting to have different types of citations: some citations are “this is a finding I’m referencing” and others are “this is a method/dataset from this work I’m using”

Robert Roth on Jun 04, 2024

This is a really cool thought (and I know some other folks are actively working on this, so hopefully we start seeing more of it soon). But I’m curious what the most effective way to display that would be. For instance, we and others use hover-over citations right now and there are a lot of different formats out there.

Do you think it would be most useful displayed directly next to the citation (or in the hover-over text)? Or would that be distracting? Or maybe encouraging writers to incorporate how they used a citation in the sentence/paragraph where it’s included? I could think of some more creative ways to display/communicate this info, but I wonder what would be the easiest to analyze without becoming overbearing.

Thank you for contributing to the discussion!

Rodolfo Aramayo on Jun 06, 2024

Selection

e the people encountering our work with a means to quickly and effectively evaluate its usefulness. While we don’t yet communicate any of this data to readers, we currently gather and analyze a variety of quantitative metrics, including:Metrics about individual pubsPubPub:PageviewsUnique visitorsCountry of visitors

My question is: Can we apply the same metrics to all publications, given that they can belong to different categories? Publications may introduce significant new biological concepts or bring novelty and clarity to a particular topic. They can also report important new technological developments and open new avenues for investigation.

Daniel Dunleavy on Dec 27, 2024

I think Rodolfo has raised an important point here. Usefulness, utility, impact - these qualities may all differ, not only by field of study, but by the type of publication or research produced. Some quick thoughts with examples might help illustrate:
- A publication articulating a set of standards (e.g., PRISMA for conducting and reporting systematic reviews and meta-analyses) may be useful to the extent that those guidelines or practices are followed/implemented in future publications or studies.
- A book or paper that is more conceptual than empirical may be useful or impactful to the extent that it has permeated scientific/scholarly discourse (Kuhn’s SSR and Ioannidis’ 2005 paper come to mind).
- A study or experiment may be impactful to the extent to which it generates new knowledge. Not all studies are equal in terms of how they change our understanding or in terms of how rigorous/decisive they are in refuting or supporting a claim. Discussions by Lakatos and Mayo come to mind here.
- Still further, we can think of a study or experiment that changes human behavior or society broadly (e.g., a study that leads to a direct shift in how day-to-day medical care is provided).
I know I have struggled with coming to terms with what impact means - previously opting for something along the lines of altmetrics over the oft-used citation count. However, I’ve come to appreciate that such metrics can also be gamed, and may well represent reach, but not necessarily impact. I suppose reach is a necessary, but insufficient condition for impact. This issue is nicely made in the original post above.

Rodolfo Aramayo on Jun 06, 2024

Selection

ts (forthcoming)Number of issues (forthcoming)Zenodo metrics: ViewsDownloadsWe also gather qualitative metrics that could indicate utility and rigor, such as responses to the survey that you'll find at the bottom of every pub and public comments on our platform. Tracking this data is helpful for researchers to determine who their work reaches, its quality, an

An important question to include in the survey is: Has this publication changed the way you think about biology or the specific problem in question? Alternatively, has this publication introduced you to a new method for performing an experiment?

Robert Roth on Sep 05, 2024

This is a really cool idea! We did something like this on our ‘The experiment continues’ pub, where we adapted the question to ask if it’s changed the way the reader thinks about publication/will approach their own publications. I like the idea of having a more direct indicator of a change in thought process. It’s probably worth thinking about having a ‘base’ survey so that one can have a consistent set of data to compare across publications, but also introduce more specific questions that pertain directly to the work (like the ones you’re suggesting). Thank you for adding to the discussion!

Alden Conner on Jan 17, 2025

Agree with Rodolfo’s comment - would also be great to have follow up, or at least encourage people to return and describe how they altered or expanded an experimental protocol based on the publication.

Rodolfo Aramayo on Jun 07, 2024

### How can we measure and communicate the impact of science?

The discussion elaborates on measuring and communicating the impact of science through different classes of publications, focusing on Reviews, Research Papers, and Methods Papers.

1. **Classes of Publications**

- Reviews

- Research Papers

- Methods Papers

2. **Evaluating Reviews**

- Comprehensive coverage

- Recency and up-to-date information

- Use of extensive research publications

- Risk of "Review of Reviews" and error propagation

- Can we create a metric for evaluating reviews based on the ratio of original research publications cited versus the number of reviews cited?

- **Summary for Reviews**: Their impact is gauged by comprehensiveness, recency, and the extent of original research citations, avoiding the pitfalls of relying too much on other reviews, which can propagate errors.
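One possible way to compute the ratio proposed above is sketched here. It assumes each cited work has already been labeled as `"research"` or `"review"`. Those labels, the function name, and the example counts are assumptions for illustration, and the labeling itself is the hard part, which is not shown.

```python
def primary_to_review_ratio(cited_types):
    """Return (original research cited) / (reviews cited), or None if no reviews are cited.

    cited_types: one label per cited work, either "research" or "review".
    """
    primary = sum(1 for kind in cited_types if kind == "research")
    reviews = sum(1 for kind in cited_types if kind == "review")
    if reviews == 0:
        return None  # the ratio is undefined when no reviews are cited
    return primary / reviews


# Made-up reference list: 30 original research papers and 10 reviews.
cited = ["research"] * 30 + ["review"] * 10

print(primary_to_review_ratio(cited))  # 3.0
```

A higher value would suggest that the review leans more on original research than on other reviews. What counts as a "good" value is left open here.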

3. **Evaluating Methods Papers**

- Wet-Lab Methods Papers

  - Accessibility and availability of reagents and materials

  - Reproducibility

  - Challenges in replicating experiments

    - Using the same reagents and conditions as the original method

    - Getting the same results

    - Importance of internal controls

- Computational Methods Papers

  - Computational Environment and Software Availability

    - Can we use the same computational environment?

    - Can we use the same software?

    - Is the software used accessible and available?

  - Challenges in replicating experiments

    - Importance of internal controls

    - Issues with computational reproducibility

- Researchers are more likely to test a protocol or a pipeline using their own data. Researchers are unlikely to use the data described/used in the original publication. Therefore, it is very hard to evaluate to what extent we were able to reproduce and replicate a given described experiment protocol or computational pipeline.

- In addition, in the case of computational pipelines, researchers are likely to use the latest version of software or scripts and/or genome files, thus making a clean and unbiased judgment regarding the reproducibility of a given computational pipeline very suspect and potentially biased.

- **Summary for Methods Papers**: Judged by reproducibility, including the accessibility of reagents and materials and the ability to replicate procedures in both wet lab and computational contexts. Challenges include not using original conditions and lacking proper internal controls.

4. **Evaluating Research Papers**

- The evaluation of research papers is inherently complex.

- Rarely do reviewers have expertise in all experimental aspects of the paper.

- Research papers can potentially be evaluated at different levels.

- Deconvolution of reproducibility:

    - Figures

        - Is the primary data used to generate the figures readily available?

        - Has a Python/R/Shell script that can regenerate the figure been provided? Scripts in R or Python (pandas, seaborn, matplotlib) can be generated to reproduce most figures.

        - If the answer to the above questions is yes, then the figure should be eligible to be assigned a DOI.

    - Tables

        - Is the primary data used to generate the tables readily available?

        - Has a Python/R/Shell script that can regenerate the table been provided? Scripts in R or Python (pandas, seaborn, matplotlib) can be generated to reproduce most tables.

        - If the answer to the above questions is yes, then the table should be eligible to be assigned a DOI.

    - Materials

        - Have the drugs, reagents, and other materials been reported in detail?

            - Can these drugs and reagents be obtained? Are they available?

            - How likely are these drugs and reagents to be affected by batch production effects?

            - Are the animals used in the research available?

            - How robust is the phenotype of the animals used in the experiment? What is the probability that a variable genetic background will affect a given phenotype used in the experiments?

    - Methods

        - How detailed is the description of the methods?

        - Are the reagents, antibodies, oligonucleotides, and dyes used readily available?

        - Can the experiment be reproduced using the same materials?

        - How dependent on specific hardware and/or specific laboratory equipment are the methods?

    - Data availability and integrity

        - Is the primary data used in the manuscript readily available?

        - Primary data from repositories like NCBI SRA

            - Is the data processing affected by issues related to metadata stripping in sequencing files during submission?

    - Scripts and commands documentation

        - Availability and documentation of processing scripts

        - Documentation of pipeline and virtual environments

        - Availability of scripts in R or Python (pandas, seaborn, matplotlib)

- **Summary for Research Papers**: Unique due to their combination of new observations and methodologies. Evaluating their reproducibility involves ensuring primary data availability, processing it according to the original study, and producing equivalent figures. Challenges include data integrity from repositories like NCBI SRA and proper documentation of processing methods. Reproducibility also hinges on the availability of experimental materials (reagents, antibodies) and proper documentation of data processing methods. Sharing primary data and scripts (preferably in Python) for figure generation enhances understanding and reproducibility, potentially leading to granular micro-publications with unique identifiers.

5. **Evaluating Supplementary Data**

- Problems with supplementary data storage and accessibility

    - Issues with broken links and deleted data

- Solutions for better data preservation

    - Using repositories like Zenodo or Figshare

    - Dissociating supplementary data from main publications

    - Exploring different publication models (GigaDB, GigaScience)

- **Summary for Supplementary Data**: The storage and accessibility of supplementary data present significant challenges. Issues with broken links and deleted data necessitate better preservation solutions, such as using repositories like Zenodo or Figshare and dissociating supplementary data from main publications. Different publication models (e.g., GigaDB, GigaScience) offer various approaches to addressing these challenges.

6. **Evaluating Data Integrity After Publication - The FASTQ File Headers Issue**

- To save disk space, NCBI-SRA started removing metadata present in headers of FASTQ files containing lane number, tile number, X and Y coordinates.

- How important are FASTQ file headers for quality control?

- What information is being lost by NCBI-SRA redefining the FASTQ headers?

- Is it worth preserving this information in another format to be able to reattach to the downloaded files if necessary?

- See [GitHub discussion for details](https://github.com/ncbi/sra-tools/issues/130)

- **Summary for FASTQ Files**: A specific issue is raised about FASTQ file headers, especially in RNA-seq datasets. A recent GitHub discussion highlighted the importance of headers containing metadata like lane number, tile number, and X and Y coordinates, which are stripped upon submission to NCBI SRA. This loss of information hampers the identification of PCR artifacts and read quality assessment. The need for alternative ways to preserve this metadata before submission is emphasized to ensure the replicability of original results. The removal of header information from 10x Genomics files at some point further complicated data reanalysis, underscoring the necessity of methods to preserve and reattach this metadata if necessary (although this might not be a current issue for 10x Genomics data).
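To make the idea of preserving header metadata "in another format" a little more concrete, here is a minimal sketch that pulls the positional fields out of each read header before submission so they could be saved alongside the data. It assumes Illumina-style (Casava 1.8+) headers, where the colon-separated fields before the first space are instrument, run, flowcell, lane, tile, X, and Y. The function names and the example record are illustrative only, and files with other header layouts would need different parsing.

```python
import io


def parse_illumina_header(header_line):
    """Parse an Illumina-style FASTQ header into its positional fields.

    Assumed layout (Casava 1.8+): @instrument:run:flowcell:lane:tile:x:y <read info>
    """
    if not header_line.startswith("@"):
        raise ValueError("FASTQ header lines must start with '@'")
    read_id = header_line[1:].split(" ", 1)[0]
    fields = read_id.split(":")
    if len(fields) != 7:
        raise ValueError("unexpected header layout: " + header_line)
    instrument, run, flowcell, lane, tile, x, y = fields
    return {
        "instrument": instrument,
        "run": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
    }


def collect_read_metadata(fastq_handle):
    """Return one metadata dict per record, reading the header on every fourth line."""
    records = []
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 == 0:
            records.append(parse_illumina_header(line.strip()))
    return records


# A single made-up record standing in for a real FASTQ file.
example_fastq = io.StringIO(
    "@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG\n"
    "GATTTGGGG\n"
    "+\n"
    "IIIIIIIII\n"
)
metadata = collect_read_metadata(example_fastq)
print(metadata[0]["lane"], metadata[0]["tile"], metadata[0]["x"], metadata[0]["y"])
# 2 2104 15343 197393
```

The resulting list could be written to a sidecar file (for example, one row per read, in read order) and deposited with the dataset, so the coordinates can be reattached later if the repository strips them.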

Note: The content of this comment was developed by first recording the ideas. ChatGPT-4 then transcribed the recordings and used these transcripts to generate an outline. The outline was subsequently manually edited and refined.

L. Robert Hollingsworth on Jun 13, 2024

Selection

them. We’re developing ways to display metrics on our publications that reflect utility and rigor. But we’re still figuring out the best form for that to take. If you have thoughts on what would be useful for you to see, please leave a comment here or on ques

This could take the form of badging as proposed here (and I’m sure other sources, too): https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002234

Astera or other nonprofit orgs like eLife could organize badging principles/practices and deploy technical experts for rigor, or others for open science audits. ASAP (Aligning Science Against Parkinson’s) has strict open-science requirements and a template for materials sharing audits that could be helpful.

Also relates to “is verifiable” in the discussion, here: https://research.arcadiascience.com/pub/open-question-measuring-reuse#nfd6bck2cia

L. Robert Hollingsworth on Jun 13, 2024

Selection

g the knowledge provided, using a tool, etc.)? Are there existing tools that do this well?What shared benchmarks should the open science community consider to evaluate the success of different publishing models?

Engagement in the new publishing models is key. (1) Importantly, how many **collaborative** studies have been published in these new forums? It’s much easier to make radical publishing decisions by oneself than to inspire colleagues to take risks. How do we incentivize groups to build in public release of their study through these models at the design phase of a project? The open science community is large but spread out, so it might be important to get folks together who are experts in different disciplines/methods to foster collaboration aimed explicitly at publication experimentation.

(2) Are people reviewing, and getting reviews? What is the quality of those reviews? Are they leading to meaningful changes in the research studies? While traditional peer review has its downsides, the upsides are a level of deep reading/scrutiny that is unusual when reading literature daily (I’ve spent upwards of 10-15 hours reviewing papers!). How do we get a mix of the things readers can catch quickly, but also incentivize deep review? Is there a way to build in incentives for peer review, such as the ability to request verified reviewers in exchange for needing to review (e.g., a token system where you must review 2-3X more than you request)? Could the tokens somehow scale with the depth of review?

(3) Ultimately, are papers being read and informing new studies? Citation and other quantitative metrics give some hint at this but take a long time to measure and can be biased.

Ethan Beswick on Aug 30, 2024

There is always a moment (or years) when research would be logically applied within its own microcosm, and I think that may be more true in a world where there are fewer structural barriers to publication/writing.

A metric/visualization mechanism that conveys the velocity and depth of the research, with differentiation (as others have suggested) between validation and expansion, would appear extremely useful. If we’ve moved past the construct of “only experts” being able to validate an idea before publication, then the overall impact needs to be more than sheer ‘number of comments’, as this would appear easily manipulable.

The nature of “expanding” on the research has value both within the field of intended research and outside of it. Separating those two is often necessary in my own experience, as research goals and expectations of outcomes are often very different. This could enable separate scoring mechanisms on “impact” and separate the noise of the discussion of progress in very different arenas.

Language/specificity: If you are looking for the broadest possible audience, then some mechanism for allowing translation of the research for different fields of research and languages would likely be useful.

From a pure language perspective, translation would seem simple and easy to incentivize given how widespread geographically the open science community is.

Purely commenting on how this paper can be improved or its faults without response/validation of the logic in those comments can be an equally slippery slope of manipulation. If commentary on research will be public, then to an extent, the validation and citation of those ideas could be incorporated into future research to help substantiate the thought process. That way users have an incentive to give great feedback that is intelligible, constructive, and usable in future research. This is then usable both as a search mechanism and as a user-curated signal: “this was useful for my research in [field X].”

Robert Roth on Sep 05, 2024

Thank you for taking the time to comment — you bring up a lot of interesting points here. I’m especially curious about your second point. When you say ‘translation of the research for different fields,’ how do you imagine that looking? Would it be translating specific terminology that would be more easily understood by someone in a different field, or a larger overhaul of the work?

Unain Ansari on Sep 16, 2024

Selection

built on?How might we effectively track the ways a given piece of research is reused (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are there existing tools that do this well?

I think an important need in science publishing, to make it more transparent and accessible, is the publication of “negative” results. Often research is bogged down by people repeating things that take extensive time and resources and do not end up working out in the end. The open science community needs to be more forgiving and transparent in sharing not only important discoveries but also things that do not work out, so people can build from them and not start from scratch.

Robert Roth on Sep 17, 2024

Thank you for adding to the discussion! I agree that increased publication of negative results is a major need (and something that Arcadia strives for in our publishing experiment). Another point about negative results that I think about is how we can make them more findable and accessible so that a given researcher may find them before they start their work. I'm definitely curious about new ways of approaching that as well.

Bethany McCarty-Kirkman on Sep 18, 2024

Selection

ts on what would be useful for you to see, please leave a comment here or on question number one! What else do we want to measure?While useful, many of the metrics above simply indicate reach (e.g. pageviews) or move at a pace t

Have you considered partnering with a vendor to track how its sales change following a publication? I’m thinking that if someone is trying to replicate the same experiment, they may opt to use your exact vendor for reagents.

Have you considered adding an “endorsement” feature? Once a scientist has used and validated your catalyst/method/technique/code, they can “endorse” the publication.

Robert Roth on Sep 19, 2024

This is a really interesting idea! I could see this being a good signal of reuse, provided that we’d be able to detect a (potentially small) spike against the background volume of sales/requests/purchases/etc. and have the confidence to correlate it with the publication. I imagine it would be easier with lesser-used materials versus more common ones.
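One way to make the "spike against background" idea concrete is a z-score: how many baseline standard deviations the post-publication sales sit above the pre-publication mean. The sketch below is only illustrative. The function name, the weekly order counts, and the comparison of a rare versus a common reagent are all assumptions, and a real analysis would also need to handle seasonality and other confounders.

```python
from statistics import mean, stdev

def spike_z_score(baseline, observed):
    """Return how many baseline standard deviations `observed` sits above the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline points")
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; z-score is undefined")
    return (observed - mean(baseline)) / spread

# Hypothetical weekly order counts before publication (made-up numbers).
rare_reagent = [2, 3, 1, 2, 2, 3, 1, 2]
common_reagent = [200, 230, 180, 210, 190, 220, 205, 195]

# Apply the same absolute increase of 10 orders to both reagents.
z_rare = spike_z_score(rare_reagent, mean(rare_reagent) + 10)
z_common = spike_z_score(common_reagent, mean(common_reagent) + 10)
print(f"rare: z = {z_rare:.1f}, common: z = {z_common:.1f}")
```

With these made-up numbers, the same ten extra orders stand out clearly for the rarely ordered reagent and are lost in the noise for the common one, which matches the intuition that lesser-used materials would be easier to track.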

As for the endorsement feature — it’s something we’ve considered, but maybe with slightly different language. Something like a ‘worked for me’ (à la protocols.io) or ‘this was useful to me’ button could be a great way to immediately show utility. I’d welcome any thoughts that people have about what would be most useful to see on a pub, or what kinds of ‘ratings’ might be most effective.

Jennifer Ramirez on Sep 28, 2024

Selection

ons to provoke discussion:In the absence of editorial decisions, what data, tags, summaries, or other information would help you quickly determine if a piece of research is relevant to your interests and use cases

As a visual learner, one thing I enjoy is when publications have a graphical abstract. These can increase initial engagement with an article for interdisciplinary readers who might otherwise see a wall of text and be drawn away.

Robert Roth on Oct 01, 2024

Thank you for your comment! I’m curious — do you tend to look at the graphical abstract (when one is available), before or after reading the abstract/introduction?

Micah Johnson on Jan 13, 2025

To determine a publication’s relevance, I first look for key terms and a graphical abstract before reading the introduction. For these types of publications, clickable key terms could be especially helpful and may also encourage better engagement with the platform.

Victor Holmes on Oct 03, 2024

Selection

e a useful marketing metric, but it doesn’t reveal much about our science or its impact on its own. We need new ways to assess the utility of our work, ensure the feedback loop is fast enough to improve it, show scientific value to readers so they can quickly assess if a pub will be useful to them, and indicate how public feedback influenced our science.What could we measure that would be more informative, and how would we collect that data efficient

I admire deeply how you’re setting up a new system for sharing science - I hope the community can help you craft a model that works. I’m worried about the incentives: Why would other scientists give you the time it takes to comment, review, offer feedback? Cynically I think willingness drops off with seniority.

Consider moving to a licensing model for your information? A no-money “paywall” that gives access to pubs only to registered users. Access to the next pub requires feedback about the last or by spending ‘community credit’, which is earned through comments on other pubs, feedback on how pubs were used, etc. This could even be filling out a quick survey.

This is still basically free and accessible to anyone, but the data you might gather through (now somewhat mandatory) feedback will tell you what is impactful. Willingness to spend community credit to, say, read past the abstract, will give you information about the interest level in a pub. Later, the ratio of citations to the number of people who accessed a pub tells you something about its impact.

Also cynically, I believe people treat something proportionally to what it cost to get it. Asking for just a little bit of community engagement in exchange for your pubs would both increase users’ perception of their value and get you much more of the feedback you’re seeking.

Robert Roth on Oct 04, 2024

Thank you for your comment and thoughtful response. The challenges of incentives and engagement are ones we're actively exploring, and your suggestion is an interesting one.

As outlined in 'Publishing v2,' we're experimenting with new ways to empower our scientists to engage with the community and foster open dialogue. This process used to be top-down, so we’re in the early stages of building out what those tools will look like.

Briefly, our aim is to release high-quality, useful science such that providing feedback becomes a mutually beneficial exercise. We believe that when scientists find our work valuable and readily applicable to their own research, they'll be more inclined to contribute their expertise and help us improve the work that they’re utilizing.

We also actively try to model this behavior by providing feedback on other researchers' preprints (over 2,000 comments across more than 500 preprints so far!). We believe that demonstrating the value of public comments can create a positive feedback loop that accelerates scientific progress, and we’re already seeing evidence of this.

While a licensing model could be an effective incentive for some, I do have a few reservations that I’d be curious to hear your thoughts on:

Accessibility: We're committed to making our science accessible to everyone, regardless of their ability to contribute feedback. Introducing a "paywall," even one based on community credit, could create a new barrier for scientists with limited time or resources.
Machine readability: As part of being open, we want to move toward making sure our work is machine-readable to better take advantage of the rapidly evolving AI/ML space. A licensing system could make it more difficult for others to use our work in this way.
Diverse perspectives: We believe valuable feedback can come from anyone, regardless of field or seniority. A system that prioritizes feedback from those who have the time/resources to contribute in a way that gets them enough credit could inadvertently skew the perspectives and could lock out some of the ‘one-off’ visitors to our science.

There are many avenues to explore in this area, and I agree that it would make measuring the impact of our work much simpler, but I think the downsides may not be worth it. I’d be really interested in keeping this conversation going if you or anyone else reading this has additional thoughts.

Alden Conner on Jan 17, 2025

This discussion really gets at the heart of the matter and demonstrates why these efforts are so necessary. I agree that any type of barrier undermines the commitment to open science and accessibility - we need to change the incentives globally, not just on a case-by-case basis (e.g., access to a given publication requiring a certain level of input).

I think part of this will require building new incentives into existing structures, such as traditional publishing and grantmaking. We’ve seen this already with many publications requiring the inclusion of code and data in a publicly-accessible format, or grants with requirements for open publishing. However, in the long run what is needed is culture change across scientific disciplines to promote open science as not only the right way to do science but the best way. One important factor in this change is making open science easy, so that it does not feel like a burden and becomes part of the day-to-day routine of scientists. The more people use truly open practices, the more others will benefit from the engagement and adopt those practices themselves. Rather than trying to fit this model into today’s incentive structure that is still fundamentally based on a “publish or perish” mentality, we need to imagine an entirely new incentive structure where open science is the default and the benefits of collaboration and engagement are the incentives.

Dhananjay Kumar on Dec 09, 2024

One idea that could enhance the visibility and accessibility of research is to create a dynamic, flowchart-like visualization—a simplified network graph—that connects a current result, data, or publication to the relevant prior works it builds upon (via incoming lines or nodes). Additionally, it could illustrate the potential impact of the research on various fields, labs, proteins, or disease areas (through outgoing lines or nodes). This network map should be placed prominently, near or alongside the abstract, offering a clear and concise overview of the relevant research landscape. It would allow the scientific community to quickly grasp the context and significance of the work.

Such a map would be especially useful for those who may not be experts in the field but have a general understanding. Establishing these connections can otherwise be time-consuming and complex. By providing this network, tracking key publications and labs working in the field would become much easier for newcomers.

Furthermore, this model could serve as a metric to gauge the impact of the publication. By monitoring the number of clicks on specific parts of the map, researchers could gain insights into which works and applications are being discovered and how the research is influencing other fields.
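As a rough illustration of the click-tracking idea above, the map can be thought of as two lists of nodes (incoming prior works, outgoing application areas) plus a running tally of clicks per node. Everything in this sketch is hypothetical: the node labels, the `pub_map` structure, and the `record_click` helper are placeholders, not an existing PubPub feature.

```python
from collections import Counter

# Hypothetical map for one pub: incoming nodes are prior works it builds on,
# outgoing nodes are the fields or areas it might influence.
pub_map = {
    "incoming": ["prior work A", "prior work B"],
    "outgoing": ["field X", "disease area Y"],
}

clicks = Counter()

def record_click(node):
    """Count a click, but only for nodes that are actually on the map."""
    if node not in pub_map["incoming"] + pub_map["outgoing"]:
        raise KeyError(f"unknown node: {node}")
    clicks[node] += 1

# Simulated reader activity.
for node in ["field X", "prior work A", "field X"]:
    record_click(node)

most_clicked = clicks.most_common(1)[0]
print(most_clicked)
```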

Robert Roth on Dec 10, 2024

Thank you for adding to the discussion; this is a really interesting idea! There’s a lot to think about here (like how to display this information to keep it from becoming overwhelming), but it’s definitely something to explore. I’d be curious to hear if you or anyone else reading this has any ideas, implementation-wise, about how to make this maximally useful. I know there are different groups working on this general idea.

Thank you again!

Sydney Ku on Dec 14, 2024

Selection

Given the wide scope of techniques and even wider scope of applications of these techniques, it could be useful for readers to easily access some basic contextual information about usage. My first intuition about the relevant context is: 1) What field is the user coming from? 2) In what industry is the user applying the technique or effect? Given that growth and usage of open access appears to be disparate across scientific fields (Severin et al., 2020), this information could be beneficial both for readers and for Arcadia’s tracking of reach and impact. Further, this could provide demographic information on where open access science is being used the most and which disciplines are viable targets for improving reach. This information could be gathered by asking PubPub users to provide their basic field and industry when they create their account, then creating an interactive feature for each Pub (something like “I used and endorse this”).

I envision a sliding-scale graphic with relative proportions of field and industry usage, respectively, with a composite number indicating exact frequency (i.e. if 20 people have used the technique, the number 20 appears on the end, with a sliding scale indicating what proportion of those users come from what disciplines and in what industries). This information not only could be used by readers to determine if those in their field use the technique and find it helpful, but also identify barriers to usage on a discipline level that could encourage cross-industry and interdisciplinary conversation and collaboration.
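The composite number and proportions described above could be computed along these lines. This is a minimal sketch under stated assumptions: the endorsement records, their `field` and `industry` keys, and the `usage_breakdown` helper are all hypothetical, since no such data or feature is described in the pub.

```python
from collections import Counter

def usage_breakdown(endorsements, key):
    """Return (total, {category: share}) for one attribute of the endorsements."""
    counts = Counter(e[key] for e in endorsements)
    total = sum(counts.values())
    shares = {category: n / total for category, n in counts.items()} if total else {}
    return total, shares

# Hypothetical "I used and endorse this" records collected for one pub.
endorsements = [
    {"field": "cell biology", "industry": "academia"},
    {"field": "cell biology", "industry": "biotech"},
    {"field": "genomics", "industry": "academia"},
    {"field": "biochemistry", "industry": "academia"},
]

total, by_field = usage_breakdown(endorsements, "field")
_, by_industry = usage_breakdown(endorsements, "industry")
print(total, by_field, by_industry)
```

The total would be the number shown at the end of the scale, and the shares would set the relative widths of the segments for each field or industry.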

Robert Roth on Dec 16, 2024

Thanks for the comment! This is an interesting idea. If we had enough engaged researchers (who confirm usage) providing their field and industry, this could be a really helpful data point to display on a publication. I’m curious about ways we might encourage people to leave an endorsement (or critique) after they’ve tried something. As an example, the ‘Works for me’ button on protocols.io is a more basic version of this, but at Arcadia we haven’t seen much activity on it. I’d be curious to know if others who publish on protocols.io see those ‘Works for me’ endorsements come in and if they’re doing anything to solicit those responses.

Veronica Pagowski on Dec 28, 2024

Selection

In science (and in most contexts, really), I believe in the importance of balance. Overemphasizing benchmarks like citations and impact factors can foster a "publish or perish" mentality, leading to competitive environments that prioritize metrics over genuine scientific progress and healthy collaboration. However, benchmarks and metrics, when used judiciously, are valuable tools for evaluating rigorous and impactful research. I think it would be helpful to develop metrics in molecular biology that encourage personal improvement and progress rather than overall standing in a scientific field. For instance, tracking the number of citations or interactions with an article, relative to the number of readers who engage with the entire article versus just the abstract, could provide insights into its accessibility and impact, pre-publication. This would be particularly useful for in-progress work such as articles posted on bioRxiv or similar platforms.

A similar balance is essential when considering modularity in scientific contributions. Smaller, incremental outputs, particularly within teams, can enhance collaboration and progress. But overly fragmented publications can hinder public engagement and lead to misinterpretation, as the broader context or significance of the work may be lost. The challenge lies in striking the right balance: fostering meaningful, modular contributions while maintaining coherence and accessibility for a broader audience.

Interactive formats, like discussion-based platforms or preprint systems with editable components (e.g., Google Docs-style commenting on bioRxiv), could help address these challenges. For example, modular in-progress publications could allow casual readers to focus on short, digestible articles while enabling those with deeper interest to explore detailed discussions or linked resources – cool website format, by the way!

A significant challenge in molecular biology is the lack of long-term infrastructure for organizing and distributing datasets. My undergraduate work was in oceanography, which I think is a great example of a field that maintains long-term, publicly accessible datasets. Dedicated organizations like NOAA and Copernicus have established models for managing and sharing large datasets with scientists and the public. Molecular biology, I think, could benefit from additional entities to handle, for example, genomic and cellular atlas data. A centralized, long-term data repository with integrative pipelines, managed by dedicated organizations rather than transient student researchers (we all have to graduate eventually!), would greatly improve accessibility and sustainability in the field.

Robert Roth on Jan 04, 2025

Thank you for this thoughtful comment! I definitely agree with a lot of what you’re saying.

I’m curious to hear more about balancing modular contributions with coherence. Have you seen any particularly good examples of this balance being achieved in practice? Or, what practices could help with this? With the increasing popularity of smaller and smaller units of publication, this is something I’ve been thinking about.

Alden Conner on Jan 17, 2025

Selection

eople learn about and use our findings quicker and will accelerate scientific progress as a whole.Utility: By breaking from rigid journal formatting, we can maximize usability and explore interactivity. Our data will be easy to find, access, use, and repurpose in ways we can’t predict.Rigor: We want public comments from anyone. Expertise lives everywhere, not just where you look fo

I think this actually undersells what is described below - there is an important distinction between “easy to find, access, use, and repurpose” and whether or not people actually find and use something. It’s possible to meet these criteria by merely posting in publicly-accessible spaces and granting open access, but those alone do not guarantee actual engagement (as you discuss further in this paper).

Robert Roth on Jan 23, 2025

Yes, this is definitely an important distinction to keep in mind. One of the main reasons we want to measure reuse is to ensure that we’re not only releasing science publicly, but that the work is actually useful and ends up in places where it can be used.

Alden Conner on Jan 17, 2025

Selection

work is rigorous and replicable)?has been verified?has been expanded or built on?How might we effectively track the ways a given piece of research is reused (i.e., others following up on a finding, applying the knowledge provided, using a tool, etc.)? Are there existing tools that do this well?What shared benchmarks should the open science community consider to evaluate the success of di

When the research includes code, having the code not only publicly available but open to collaboration on a platform such as GitHub is a great way to track this. Looking at the number of forks, PRs, and comments can provide real-time data on reuse. Ideally, authors will encourage not only reuse but contributions and new adaptations of the code, and these should all be linked and visible through GH.
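As a sketch of how those signals might be collected, the public GitHub REST API returns repository metadata that includes fork and star counts. In the example below, the helper names `fetch_repo_metadata` and `reuse_signals` and the sample payload are assumptions made for illustration. The field names (`forks_count`, `stargazers_count`, `open_issues_count`) are those returned by the repository endpoint. Pull request and comment counts are not covered here and would need separate API calls.

```python
import json
from urllib.request import Request, urlopen

def fetch_repo_metadata(owner, repo):
    """Fetch public repository metadata from the GitHub REST API (needs network access)."""
    req = Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)

def reuse_signals(metadata):
    """Pick out the counts that hint at reuse; missing fields default to 0."""
    return {
        "forks": metadata.get("forks_count", 0),
        "stars": metadata.get("stargazers_count", 0),
        # GitHub counts open pull requests together with open issues in this field.
        "open_issues_and_prs": metadata.get("open_issues_count", 0),
    }

# Hypothetical payload standing in for a real API response.
sample = {"forks_count": 12, "stargazers_count": 40, "open_issues_count": 5}
signals = reuse_signals(sample)
print(signals)
```

To try it against a live repository, pass the result of `fetch_repo_metadata(owner, repo)` to `reuse_signals`. Unauthenticated requests are rate limited, so a real tracker would likely authenticate and cache results.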

Robert Roth on Jan 23, 2025

Definitely agreed! We’ll be experimenting more with pubs that are directly intertwined with GitHub repositories, so we’re hoping to see more indications of reuse on them. Thank you for commenting!


Note The content of this comment was developed by first recording the ideas. ChatGPT-4 then transcribed the recordings and used these transcripts to generate an outline. The outline was subsequently manually edited and refined.
