How can we use big data from iNaturalist to address important questions in Entomology?

February 26, 2024

There are more than 1,000,000 (1 million) known species of insects around the world. If we include species that have not been described yet, the number of insect species is estimated to be around 5.5 million. If we think about individual insects, there are an estimated 10,000,000,000,000,000,000 (10 quintillion) insects on Earth. Insects play a big role in human society, not just as pests and disease vectors, but also as pollinators and important members of our ecosystems. It is difficult to imagine what Earth would be like without insects. Scientists speculate that a decline in insect populations would lead to a severe decrease in agricultural yields and thus to global food shortages.

There have been attempts to answer important questions in entomology using big data, such as whether insect populations are declining, how fast insect diversity is decreasing, how insects are responding to a rapidly changing climate and extreme temperature fluctuations, and how agricultural pests and insect disease vectors are spreading. How do we deal with data on insects when they are found on almost every landmass on Earth and there are so many species in such large numbers?

Citizen science can generate big data

One way to generate and collect data on biodiversity and biogeography is citizen science. Citizen science is a collaboration between professional scientists and members of the public, who participate in data collection or analysis for scientific research. For example, citizen scientists can report harmful algal blooms, report (and squash) highly invasive spotted lanternflies, or participate in local Christmas bird counts. One of the largest citizen science projects, and one that is generating huge amounts of data, is iNaturalist.

What is iNaturalist?

iNaturalist is a “nonprofit social network of naturalists, citizen scientists, and biologists built on the concept of mapping and sharing observations of biodiversity across the globe.” If you have never used it, I highly recommend it. It’s like Pokémon GO for naturalists. You can take a picture of an organism you see (it could be an animal, plant, or mushroom) and upload it to iNaturalist. The organism does not have to be rare or new, but the photograph should be good enough to identify the organism and the location information should be accurate. iNaturalist will identify the organism (or make suggestions) using the photo and location, and members of the iNaturalist community can agree or disagree with the identification. If the identification is confirmed, your observation may be categorized as “research grade” and can be used in research.

iNaturalist started in 2008 as a Master's degree project of Ken-ichi Ueda, Jessica Kline, and Nate Agrin at the University of California, Berkeley's School of Information, and became an independent nonprofit organization in July 2023. In 2023, iNaturalist reached 150 million verifiable observations and had 350,000 active users. The main purpose of iNaturalist is to help people identify the plants and animals around them and learn about the natural world, not to generate data. However, iNaturalist has a system in place to evaluate whether observations can be used in research, and this makes it an exciting source of big data.
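For readers who want to pull these records themselves, here is a minimal sketch of how a request for research-grade insect observations could be put together in Python. It only builds the request URL for iNaturalist's public API and does not download anything. The function name is made up for this post, and the taxon ID for insects is an assumption that should be checked on the iNaturalist website before use.

```python
from urllib.parse import urlencode

# Public iNaturalist API endpoint for observation searches.
API_URL = "https://api.inaturalist.org/v1/observations"

# Assumption: 47158 is the iNaturalist taxon ID for the class Insecta.
# Confirm it on the taxon page before relying on it.
INSECTA_TAXON_ID = 47158


def research_grade_query(taxon_id, per_page=30, page=1):
    """Build a URL that requests only research-grade observations of a taxon."""
    params = {
        "taxon_id": taxon_id,
        "quality_grade": "research",  # skip "casual" and "needs_id" records
        "per_page": per_page,
        "page": page,
    }
    return f"{API_URL}?{urlencode(params)}"


print(research_grade_query(INSECTA_TAXON_ID))
```

Filtering on the quality grade up front keeps unconfirmed identifications out of the dataset, which matters when the records feed into an analysis.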

iNaturalist data helps advance science

Scientists are actively using iNaturalist both as a platform to recruit citizen scientists and record their data for specific projects and as a source of new discoveries and observations. New species continue to be discovered through iNaturalist observations. For example, a posting of an iris-like flower growing on a rock (Figure 1) led to a new species description by Manning and Goldblatt (2024). Early detection and rapid response are important for controlling the spread of invasive species, and iNaturalist is an excellent platform for tracking their presence and spread in an area.

Figure 1. The posting on iNaturalist that led to a new species discovery (Moraea saxatilis).

Biases in iNaturalist data

To properly use data from iNaturalist, we need to understand the biases that may be present in the distribution of species on the platform. The most observed species is the Mallard (Figure 2), and I can confidently say the Mallard is not the most abundant animal on Earth. Larger organisms are easier to observe, so they are overrepresented in the data. More colorful insects are more likely to be observed than less colorful ones.

Figure 2. Species with the most observations on iNaturalist up to 2022 (from iNaturalist’s Blog).

iNaturalist data also carry a spatial bias: observations are not distributed evenly. Globally, most locations in the United States and Western Europe have more than 1,000 observations, while Canada, China, Russia, and many countries in Africa have fewer (Figure 3). On a smaller geographical scale, more observations are made near roads or trails than in less accessible locations, suggesting a “trail bias”.

Figure 3. iNaturalist observations are unequally distributed around the world (from iNaturalist’s Blog).
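One simple way to see this kind of unevenness in a dataset is to sort observations into grid cells and count how many fall in each cell. The sketch below does this in Python with a one-degree grid. The coordinates are invented for illustration and the function names are my own, so treat it as a starting point and not an established method.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_size=1.0):
    """Return the south-west corner of the grid cell containing a point."""
    return (
        math.floor(lat / cell_size) * cell_size,
        math.floor(lon / cell_size) * cell_size,
    )


def count_by_cell(points, cell_size=1.0):
    """Count observations per grid cell."""
    return Counter(grid_cell(lat, lon, cell_size) for lat, lon in points)


# Hypothetical (latitude, longitude) pairs, invented for illustration only.
observations = [
    (40.71, -74.01),
    (40.73, -73.99),
    (40.78, -73.97),
    (64.20, -100.50),
]

counts = count_by_cell(observations)
for cell, n in counts.most_common():
    print(cell, n)
```

Cells with very few records may reflect low sampling effort and not low insect diversity, so the counts are best read as a map of where people look.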

Data from iNaturalist need to be used with caution. There is an ongoing effort to understand the differences and similarities between actual species distributions and what is observed and recorded on iNaturalist. One method of validating iNaturalist data is to cross-check iNaturalist records against museum records. Traditional methods of surveying insect diversity, such as setting up traps and going through each sample to identify every insect, can be labor-intensive and time-consuming. Data generated by iNaturalist can supplement existing methods of studying insect diversity and distribution and advance the field of entomology.
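As a rough illustration of the cross-check mentioned above, the Python sketch below compares a list of species recorded on iNaturalist with a list from museum records for the same region. Both lists are invented for illustration and the function name is my own. A real comparison would also need to reconcile synonyms and misidentifications.

```python
def compare_species_lists(inat_species, museum_species):
    """Split species names into shared, iNaturalist-only, and museum-only sets."""
    inat, museum = set(inat_species), set(museum_species)
    return {
        "both": inat & museum,
        "inaturalist_only": inat - museum,
        "museum_only": museum - inat,
    }


# Hypothetical species lists for one region, invented for illustration only.
inat_records = ["Danaus plexippus", "Apis mellifera", "Harmonia axyridis"]
museum_records = ["Danaus plexippus", "Apis mellifera", "Bombus affinis"]

result = compare_species_lists(inat_records, museum_records)
for group, names in result.items():
    print(group, sorted(names))
```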


  1. Bayraktarov, E., Ehmke, G., O'Connor, J., Burns, E.L., Nguyen, H.A., McRae, L., Possingham, H.P. and Lindenmayer, D.B., 2019. Do big unstructured biodiversity data mean more knowledge? Frontiers in Ecology and Evolution, p.239.
  2. Bosenbecker, C., Anselmo, P.A., Andreoli, R.Z., Shimizu, G.H., Oliveira, P.E. and Maruyama, P.K., 2023. Contrasting nation-wide citizen science and expert collected data on hummingbird–plant interactions. Perspectives in Ecology and Conservation, 21(2), pp.164-171.
  3. Caley, P., Welvaert, M. and Barry, S.C., 2020. Crowd surveillance: estimating citizen science reporting probabilities for insects of biosecurity concern. Journal of Pest Science, 93, pp.543-550.
  4. Callaghan, C.T., Poore, A.G., Hofmann, M., Roberts, C.J. and Pereira, H.M., 2021. Large-bodied birds are over-represented in unstructured citizen science data. Scientific Reports, 11(1), p.19073.
  5. Fisher, S., Fisher, R.N. and Pauly, G.B., 2022. Hidden in plain sight: detecting invasive species when they are morphologically similar to native species. Frontiers in Conservation Science, 3, p.846431.
  6. Hochmair, H.H., Scheffrahn, R.H., Basille, M. and Boone, M., 2020. Evaluating the data quality of iNaturalist termite records. PLoS One, 15(5), p.e0226534.
  7. Loarie, S. (2022). We’ve passed 100,000,000 verifiable observations on iNaturalist! [online] iNaturalist. Available at: [Accessed 20 Feb. 2024].
  8. Loarie, S. (2023). 150,000,000 observations on iNaturalist! [online] iNaturalist. Available at:
  9. Loarie, S. (2024). iNaturalist January News Highlights. [online] iNaturalist. Available at: [Accessed 20 Feb. 2024].
  10. Geurts, E.M., Reynolds, J.D. and Starzomski, B.M., 2023. Not all who wander are lost: Trail bias in community science. PLoS One, 18(6), p.e0287150.
  11. Manning, J.C. and Goldblatt, P., 2024. Moraea saxatilis (Iridaceae: Iridoideae), a new montane species from the Groot Winterhoek Wilderness Area of the Cape Floristic Region, South Africa. South African Journal of Botany, 165, pp.39-42.