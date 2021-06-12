



Background: Alongside the COVID-19 pandemic, government authorities around the world have had to face a growing infodemic capable of causing serious damage to public health and the economy. In this context, the use of information monitoring tools has become a primary necessity.

Purpose: This study tests the reliability of Google Trends, a widely used information monitoring tool. In particular, it analyzes relative search volumes (RSVs) and quantifies their dependence on the day they were collected.

Method: RSVs for the query coronavirus + covid over February 1 to December 4, 2020 (period 1) and February 20 to May 18, 2020 (period 2) were collected daily from Google Trends from December 8 to December 27, 2020. The survey covered Italian regions and cities, as well as countries and cities worldwide. The search category was set to all categories. Each dataset was analyzed to observe any dependence of the RSVs on the day they were collected. To do this, calling i the country, region, or city under study and j the day on which its RSV was collected, a Gaussian distribution $X_i = X(\sigma_i, \bar{x}_i)$ was used to represent the trend of the daily variations of $x_{ij} = \mathrm{RSV}_{ij}$. When a missing value was revealed (an anomaly), the affected country, region, or city was excluded from the analysis. When the anomalies exceeded 20% of the sample size, the entire sample was excluded from the statistical analysis. Pearson and Spearman correlations between RSVs and the number of COVID-19 cases were calculated day by day, to highlight any variation related to the day the RSVs were collected. Welch's t-test was used to assess the statistical significance of the differences between the mean RSVs of the different countries, regions, or cities in a given dataset. Two RSVs were considered statistically confident when t < 1.5. A dataset was deemed unreliable if the confident data exceeded 20% (the confidence threshold). The percentage increase Δ was used to quantify the difference between two values.
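The comparison rules in the Method can be illustrated with a minimal Python sketch. It is not the author's code: the function names, the pairwise comparison of locations, and the definition Δ = (new − old) / |old| × 100 are assumptions made for illustration. Only the t < 1.5 criterion and the 20% limits come from the text above.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: identical means give 0, otherwise infinity.
        return 0.0 if mean(a) == mean(b) else math.inf
    return abs(mean(a) - mean(b)) / se


def percentage_increase(old, new):
    """Percentage increase from old to new (assumed definition of the abstract's Δ)."""
    return (new - old) / abs(old) * 100.0


def confident_fraction(rsv_by_place, t_max=1.5):
    """Fraction of location pairs whose mean RSVs give t < t_max.

    rsv_by_place maps a location name to the list of RSVs obtained for it
    on the different collection days (the x_ij for a fixed i).
    """
    pairs = list(combinations(rsv_by_place, 2))
    confident = sum(
        1 for p, q in pairs if welch_t(rsv_by_place[p], rsv_by_place[q]) < t_max
    )
    return confident / len(pairs)


def exceeds_limit(fraction, limit=0.20):
    """True when a fraction (anomalies or confident data) is above the 20% limit."""
    return fraction > limit


# Hypothetical RSVs for three locations, each collected on three days.
example = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [50, 51, 52]}
fraction = confident_fraction(example)
print(f"confident pairs: {fraction:.2f}, unreliable: {exceeds_limit(fraction)}")
```

In this toy example only the pair (A, B) falls below t = 1.5, so one pair out of three is confident and the hypothetical dataset would be flagged as unreliable under the 20% rule.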
Results: Google Trends was affected by an acceptable amount of anomalies only for the RSVs of Italian regions (0% in both period 1 and period 2) and of countries worldwide (9.7% in period 1 and 10.9% in period 2). However, the correlations between RSVs and COVID-19 cases varied substantially even in these two datasets (max |Δ| = +625% for Italian regions and max |Δ| = +175% for countries worldwide). In addition, only the RSVs of countries worldwide did not exceed the confidence threshold. Finally, the large number of anomalies recorded in the RSVs of Italian and international cities made these datasets unusable for any kind of statistical inference.

Conclusion: During the periods considered, Google Trends proved reliable only for surveys of RSVs of countries worldwide. Because RSV values depended strongly on the day they were collected, it is essential that authors of future studies collect query data over several consecutive days and work with average RSVs rather than daily RSVs, so that an established confidence threshold is respected. Further research is needed to assess the effectiveness of this method.
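The recommendation to work with averaged RSVs can be sketched as follows. This is an illustrative example only: the function name and the sample values are hypothetical, and reporting the standard error of the mean alongside the average is an assumption about how the spread across collection days might be summarized.

```python
import math
from statistics import mean, stdev


def mean_rsv_with_se(daily_rsvs):
    """Return (mean, standard error of the mean) for RSVs of one location
    collected on several consecutive days."""
    n = len(daily_rsvs)
    if n < 2:
        raise ValueError("at least two collection days are needed")
    return mean(daily_rsvs), stdev(daily_rsvs) / math.sqrt(n)


# Hypothetical RSVs for the same location and period, collected on four days.
avg, se = mean_rsv_with_se([40, 44, 42, 46])
print(f"mean RSV = {avg:.1f} ± {se:.2f}")
```

Collecting on more days shrinks the standard error roughly as 1/√n, which is one way to judge whether enough collection days have been gathered.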

Keywords: COVID-19; Google Trends; Google Trends Data; Google Trends Data Analysis; Social Science Research.

