Evaluation Criteria for Health Websites: Critical Review
The significant usage of health websites and their roles as diagnostic and therapeutic tools have increased the importance of evaluating their credibility. Health websites are evaluated using the criteria introduced in the health guidelines; therefore, this study aimed to evaluate the adequacy of these criteria.
Material and Methods:
In this critical review, guidelines for "Health Websites Evaluation" and "Website Evaluation in Other Subject Areas" were extracted from valid databases using sensitive keywords. Classification, comparison and content analyses were then performed using scientific methods designed in this study.
The results indicate that in terms of various components of health websites, the evaluation criteria are not adequate. Note that health website evaluation criteria are designed based on the evaluation criteria of other subject areas. Therefore, the criteria share problems similar to those of the guidelines for other subject areas, and they ignore the evaluation of the specific features of health websites. It is necessary to have reliable and accurate guidelines to evaluate health websites.
Due to powerful Internet tools, the world today is based on information [1, 2]. The use of the Internet in the health care industry has led to the provision of educational, diagnostic and therapeutic services that are high quality, low-cost, accessible and on time [3, 4]. In this atmosphere, communication between "medical professors and students", "physicians and patients", "patients with each other" and, in particular, "health information seekers and the information available on websites" has created a distinctive diversity of users [4-7].
This platform includes a large amount of health information that is considered to be a potential opportunity to promote social health. Studies have shown that people in every age group search the Internet for health information and use their findings. Furthermore, a variety of texts, images, graphics, audio and video files and applications from different agencies, including governments, hospitals, universities, research centers, medicine manufacturers, medical equipment manufacturers, business organizations, public associations and non-accredited organizations, are loaded on various websites. This information bombardment confuses health information seekers and impairs appropriate decision-making. Note that users' lack of medical skills affects their search for information on the Internet; combined with low health literacy, which reduces their ability to select authentic and credible sources and services, this is a serious problem.
However, physicians and health care providers pay attention to the accuracy of information uploaded on websites and to the provision of high-quality eHealth services. These providers would also be eager to benefit from the advantages of these virtual environments as an effective aid, if their concerns were addressed. Therefore, considering the importance of obtaining accurate and high-quality health information, it is necessary to validate websites and rank their confidence levels. To achieve this purpose, health website evaluation is an attainable practical solution [13-15]. In this regard, a number of credible organizations have produced guidelines that include health evaluation criteria [8, 14, 16]. Some of these guidelines come from the Health On the Net (HON) Foundation, the Medical Library Association (MLA), MedlinePlus (NLM), the National Center for Complementary and Integrative Health (NCCIH) and the Food and Drug Administration (FDA).
Currently, websites are built from specific components using the most up-to-date technologies and, based on the services they provide, are classified into different types. As a result of technological progress, the usage of health websites has changed tangibly compared with other websites; thus, to evaluate them, the following principal question must be asked: "Are the evaluation criteria presented in current guidelines adequate and updated?"
No in-depth comparative study has yet answered this question by addressing the status, coverage, and adequacy of health evaluation criteria. Thus, this study aimed to determine their strengths and weaknesses. Considering the specific features and characteristics of these websites, a comparison was made between health website evaluation criteria and the website evaluation criteria of other subject areas.
MATERIAL AND METHODS
In this critical review, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used as a model, and their general principles were followed.
A variety of databases and search engines, including PubMed, ScienceDirect, Web of Science, ProQuest and Google Advance, were searched according to the pre-specified search strategy described below.
A list of the search terms commonly used for website evaluations was obtained from published literature. These terms included evaluation, quality, health information, credibility, reliability, accuracy, readability, criteria, website, Internet, electronic and eHealth.
The search query was: (quality OR credibility OR reliability OR accuracy OR readability OR evaluation OR assessment) AND (health information) AND (online OR Internet OR web OR eHealth OR e-Health OR cyber* OR electronic) AND (criteria OR criterion).
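The structure of this query (synonyms joined with OR inside each concept group, and the groups joined with AND) can be sketched in a few lines of Python. This is only an illustration: the names `TERM_GROUPS` and `build_query` are hypothetical, and each database has its own search syntax, so the resulting string would likely need adapting before use.

```python
# Term groups copied from the search query reported above.
TERM_GROUPS = [
    ["quality", "credibility", "reliability", "accuracy",
     "readability", "evaluation", "assessment"],
    ["health information"],
    ["online", "Internet", "web", "eHealth", "e-Health", "cyber*", "electronic"],
    ["criteria", "criterion"],
]


def build_query(groups):
    """Join the terms of each group with OR, then join the groups with AND."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the terms in a list per concept makes it easy to add a synonym to one group without touching the others.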
Inclusion criteria:
Full-text resources were available.
The guidelines were up-to-date.
The guidelines were defined through credible universal references.
The evaluation criteria were presented in the guidelines.
Exclusion criteria:
There were not enough evaluation criteria in the guidelines.
The definitions were ambiguous and irrelevant to the evaluation criteria used in the guidelines.
In relation to evaluation principles, the guidelines were not comprehensive.
The search results are shown in Fig 1. Resource extraction was performed using the following methods:
Forty-three resources from PubMed, 55 from ScienceDirect, 32 from Web of Science, 38 from ProQuest and 190 from Google Scholar were retrieved. After the titles and abstracts of the retrieved resources were reviewed, 75 duplicate references were excluded. The remaining 283 resources were then carefully studied and compared with the inclusion criteria of this study. The 74 sources that remained were examined by the researchers against the exclusion criteria. During this process, 20 guidelines were selected for analysis.
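The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Records retrieved per source, as reported in the text.
retrieved = {
    "PubMed": 43,
    "ScienceDirect": 55,
    "Web of Science": 32,
    "ProQuest": 38,
    "Google Scholar": 190,
}

total_retrieved = sum(retrieved.values())
duplicates_excluded = 75
screened = total_retrieved - duplicates_excluded

# Later stages, as reported in the text.
checked_against_exclusion = 74
guidelines_selected = 20

print(total_retrieved, screened)
```

The total of 358 retrieved records minus 75 exclusions gives the 283 resources that the text says were compared with the inclusion criteria.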
Content Categorization and Analysis
Based on the nature of their usage, the guidelines were categorized into two groups. The first group included health website evaluation guidelines, which were named "Health Guidelines". The features of these guidelines are presented in Table 1.
The second group contains website evaluation guidelines in other subject areas, which were named "Other Guidelines". A list of features of these guidelines is presented in Table 2.
Study of the evaluation criteria
Because the names and definitions of the evaluation criteria used in the guidelines differed, for subsequent comparisons, it was necessary to assign unified names and definitions for each evaluation criterion. For this purpose, in each of the groups (including "Health Guidelines" and "Other Guidelines"), the following steps were taken.
First, the criteria that seemed to be conceptually similar and shared many common features were placed in the same category. The criteria in each category were then examined. Based on how frequently each name recurred and whether it conveyed the correct meaning, a suitable name was chosen for each criterion. The results are shown in Table 3.
Then, based on the definitions that were commonly associated with each one of the criteria, a unified definition of the criterion's concept was proposed. The results of the definitions are shown in Table 4. Note that to present unified definitions, attention was directed toward the common points between definitions, and no attempt was made to provide accurate and perfect definitions.
Analysis of the evaluation criteria in "Health Guidelines"
Considering the specific features and characteristics of health websites, the adequacy of the evaluation criteria in the group of "Health Guidelines" was examined.
Comparative analysis of the evaluation criteria in the two groups of guidelines
The frequency with which each evaluation criterion was used was examined separately in the "Health Guidelines" group and in the "Other Guidelines" group, and the results are shown in Tables 5 and 6.
The evaluation criteria of one group of guidelines were then reviewed against those of the other group, and the results relevant to the status of health websites were reported.
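As a rough sketch of this kind of comparison, the following Python counts how many guidelines in each group use each criterion and then lists the criteria found in only one group. The guideline codes and criteria sets are hypothetical placeholders and are not taken from Tables 5 or 6.

```python
from collections import Counter

# Hypothetical placeholder data: NOT the contents of Tables 5 or 6.
HEALTH_GUIDELINES = {
    "H1": {"authority", "currency", "privacy"},
    "H2": {"authority", "currency", "accuracy"},
    "H3": {"authority", "purpose"},
}
OTHER_GUIDELINES = {
    "O1": {"authority", "currency", "design"},
    "O2": {"accuracy", "design"},
}


def criterion_usage(guidelines):
    """Count, for each criterion, the number of guidelines that use it."""
    counts = Counter()
    for criteria in guidelines.values():
        counts.update(criteria)
    return counts


health_usage = criterion_usage(HEALTH_GUIDELINES)
other_usage = criterion_usage(OTHER_GUIDELINES)

# Criteria that appear in one group of guidelines but not in the other.
only_in_other = set(other_usage) - set(health_usage)
only_in_health = set(health_usage) - set(other_usage)

print(health_usage.most_common())
print(sorted(only_in_other), sorted(only_in_health))
```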
Table 1. Features of health guidelines
Table 2. Features of other guidelines
The large number of evaluation guidelines [8, 38-40] suggests that they may share a common fundamental structure that researchers have developed in light of technological advances. According to the findings of the present study, however, no evolutionary growth is visible in these guidelines. Additionally, these guidelines reflect individual, personal or organizational views and do not follow a valid and general instruction. The following reasons might explain this premise.
As shown in Table 3, numerous names were assigned to a virtually common concept associated with a criterion. Therefore, the names do not have a specific root. This problem is clearly visible in "Health Guidelines" and "Other Guidelines".
Table 3. Features related to the names of evaluation criteria
General, ambiguous and simple definitions
In some "Health Guidelines", definitions are not clear enough, and in some of them, such as "Health On the Net Foundation", the definitions are too general. For example, in this guideline, the authority criterion is defined as "The qualifications of the authors". Other criteria in this guideline are general in the same manner. In a number of "Other Guidelines", definitions are presented as questions that do not lead to the formation of a precise concept of the criterion in the mind of the evaluator. One of these guidelines is "Health Direct Australia". For these reasons, and because the resources lack unified definitions, unified definitions were created to allow a comparison between different criteria, as shown in Table 4.
In some cases, in the "Health Guidelines", a single criterion, based on the definitions provided in Table 4, includes the definitions of several criteria.
The definitions provided in Table 4 have led to certain notions that form the basis for the judgments in this research. In some instances, based on these definitions, overlaps between the concepts of health guideline criteria have been observed. For example, in "MedlinePlus", purpose also conveys the concept of bias, and the two cannot be differentiated from one another; in the "National Center for Complementary and Integrative Health", a definition of the bias criterion is also presented within the definition of the authority criterion.
Table 4. Definitions related to the evaluation criteria
In some cases, within the "Health Guidelines", two distinct criteria share similar concepts or closely resembling definitions, judged against the definitions in Table 4. For example, the definition of bias in "Dalhousie University's Kellogg Health Science Library" is the same as the definition of authority in the "American Cancer Society" Guideline, which is expressed as "Who runs this website?" [23, 26].
Differences of selected evaluation criteria in different "Health Guidelines"
In Table 5, columns represent guideline codes and rows indicate selected criteria. Each criterion is used in only some of the guidelines. The results of this table indicate that in each guideline, a group of criteria has been selected according to the provider's opinion, and some of the criteria have been neglected. Some significant criteria that are essential for evaluating health websites have been ignored, and the reason for this neglect is unknown. Note that authority and currency have been of particular importance among the nine selected guidelines. However, privacy and purpose, which are particularly important for evaluating health websites, are not selected in three guidelines, and in four guidelines, accuracy and bias have not been taken into account. Audience has not been included in five guidelines. In addition, reliability, disclosure, usability, content, citation and link are rarely chosen.
Table 5. Comparison of evaluation criteria in the "Health Guidelines" group
Table 6. Comparison of evaluation criteria in the "Other Guidelines" group
Lack of a questionnaire
Although two guidelines, those of the HON Foundation and the college of Maryland university [17, 24], include questionnaires for evaluating health websites, no precise questionnaire was found in eight "Other Guidelines". Where a questionnaire exists, it determines the method by which a website is evaluated. The absence of a questionnaire has led some researchers to customize questionnaires to evaluate health websites [53, 54], so that different groups report website evaluation results differently.
Unclassified evaluation criteria
In some studies of other subject areas, criteria are presented hierarchically, in groups such as content, design, organization and usability [9, 48]. This adds depth to the website evaluation and helps reveal the dimensions in which a website is strong or weak. In the ten surveyed guidelines, this hierarchical view was not found. As shown in Table 4, these criteria are better classified into different groups. For instance, the definition of content in Table 4 contains two other criteria: currency and accuracy. Therefore, content can be a group name, and currency and accuracy are its members.
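A hierarchical classification of this kind can be represented as a simple mapping from group names to member criteria. In the minimal sketch below, only the "content" group and its two members come from the text; the second group and its member are hypothetical placeholders, as are the function names.

```python
# "content" with "currency" and "accuracy" follows the example in the text.
# The "usability" entry is a hypothetical placeholder for illustration only.
CRITERIA_GROUPS = {
    "content": ["currency", "accuracy"],
    "usability": ["navigation"],
}


def group_of(criterion, groups=CRITERIA_GROUPS):
    """Return the group a criterion belongs to, or None if it is ungrouped."""
    for group, members in groups.items():
        if criterion in members:
            return group
    return None


def flatten(groups=CRITERIA_GROUPS):
    """List every criterion as a (group, criterion) pair."""
    return [(group, member)
            for group, members in groups.items()
            for member in members]


print(group_of("accuracy"))
print(flatten())
```

With this structure, an evaluator could report results per group as well as per criterion, which is the added depth the hierarchical view is meant to provide.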
Absence of weighted evaluation criteria
In none of the studies were the criteria weighted based on their degree of importance. Some criteria examine more important aspects of a website's status than others and therefore have a greater impact on the evaluation, which should be specified.
However, the criteria presented in the "Health Guidelines" seem to be designed based on "Other Guidelines". In support of this claim, the following items can be mentioned:
"Other Guidelines" have a longer history, and website evaluations were first performed in other subject areas, particularly in library science.
All existing problems in "Health Guidelines" are also found in "Other Guidelines". For example, Table 6 shows a dispersion in the selection of criteria in the "Other Guidelines" that is similar to the findings discussed for Table 5.
There is currently no consensus on the definition of eHealth in available resources. In a 2001 article titled "The eHealth landscape: a terrain map of emerging information and communication technologies in health and health care," Eng stated that a good definition is the transfer of health information over the Internet with the aim of improving the quality of health services. In a 2002 article entitled "Personalizing medicine on the Web. E-health offers hospitals several strategies for success," Meyer and colleagues declared that services provided by eHealth can be categorized into four main groups: eHealth commerce, eHealth content, eHealth care and eHealth connectivity. In a 2002 article entitled "Kundenbindungsstrategien von e-Health Services-Anbietern," Kirchgeorg and Lorbeer proposed that in addition to the four abovementioned groups, eHealth communities should also be taken into account. To supply these services, a variety of health website types are used, including medical websites, health portals, documental websites and information websites. The quality of health websites seems to depend on the quality of the services that they provide. Therefore, evaluating the quality of services provided by health websites determines the degree of their validity.
Note that the provision of such services can be carried out through various components. For instance, a medical website providing medical services can use such components as teleconference, chat or electronic records. A patient portal needs components to send and receive information between patients, health care providers and electronic records [60, 61].
The points mentioned above suggest that a simplistic approach to evaluating these widespread services, which have led to the creation of a variety of health websites, would not yield valuable results. This study found that the criteria in "Health Guidelines" are taken from "Other Guidelines" criteria, and they are not appropriate for evaluating the various components of health websites. In this regard, in a 2016 article entitled "Making Quality Health Websites a National Public Health Priority: Toward Quality Standards," Devine and colleagues acknowledged that, despite the key role of health websites, there is no standardized and accurate criterion to evaluate them. In a 2010 article entitled "Revisiting the online health information reliability debate in the wake of 'web 2.0': an inter-disciplinary literature and website review," Adams likewise emphasized that guidelines have not been updated for the evaluation of health websites, which have been transformed by technological changes and are providing new services.
According to some "Health Guidelines", scoring systems for websites have been developed as "Quality Evaluation Tools", including HON Code, DISCERN, and JAMA benchmarks. Although these tools are not new and have some shortcomings, they are still used to evaluate health websites. Currently, the most comprehensive tool for measuring the quality of health websites, WebMedQual, evaluates specific dimensions of health websites. However, because important dimensions, such as accuracy and accessibility, are not evaluated by this tool, it is not considered to be a perfect tool. The incompatibility of these tools with the development of health website services, and their failure to evolve alongside it, are obvious. However, it seems that most of the problems with these tools are related to the shortcomings of the evaluation criteria in the guidelines already discussed in this study.
The compatibility of the evaluation criteria in the guidelines with the components used in the website should not be ignored. In this study, only one guideline, the "e-Health Code of Ethics", was found that defines evaluation criteria compatible with the recent components of health websites and addresses their differences. For example, in the content evaluation group, this guideline has distinctly emphasized medical information, information on medical results and information about health care providers as different criteria. Additionally, commerce, privacy, care and other criteria, including their features, have been proposed. This guideline focused on the criterion of consultation with a specialist and how to communicate through website components, such as email or other facilities of a website.
Patients are not the only users of health websites; physicians and health care providers as specialists also pay attention to the quality of online services. Thus, a true understanding of health website evaluation will lead to valuable results. Therefore, the following steps are recommended to design a systematic approach to evaluate websites.
First, it is necessary to describe eHealth services and to classify health websites based on their services. Due to the disagreement among researchers, it is essential to conduct an in-depth study and come to a consensus [45, 58, 67]. In fact, the classification of health websites based on their services helps evaluators identify required components, apply appropriate evaluation criteria according to the type of components and evaluate the existence and adequacy of these components. For example, the components used in a patient portal have commonalities and differences with the components used in a telemedicine website. These commonalities and differences have to be precisely defined, so that at the time of evaluation, the nature of the website and its functionality can be evaluated.
Afterwards, it is possible to introduce evaluation criteria based on each service group. Thus, a package of evaluation criteria associated with each electronic service can be provided. If a specific service is available on any type of website, the relevant package can then be called and used for evaluation. In providing these packages, it is necessary to solve the problems associated with the evaluation criteria that are discussed in this study. In particular, it is important to determine a quantitative scale for all the criteria and specify their weights. Finally, it is possible to design a comprehensive guideline with coherent evaluation methods.
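One way to picture such packages is as a mapping from each service to its weighted criteria, with a website scored only on the packages for the services it offers. The sketch below is a minimal illustration under stated assumptions: the service names follow the eHealth groups cited earlier, but the criteria in each package, the weights, the 0-5 rating scale and the function names are all hypothetical, and the weighted mean is just one possible scoring rule.

```python
# Hypothetical packages: the criteria and weights are illustrative assumptions.
PACKAGES = {
    "content": {"accuracy": 3, "currency": 2, "authority": 2},
    "care": {"privacy": 3, "authority": 2},
}


def weighted_score(ratings, weights):
    """Weighted mean of the ratings, on the same scale as the ratings."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"unrated criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def evaluate(services, ratings):
    """Score each service a website offers with that service's package."""
    return {s: weighted_score(ratings[s], PACKAGES[s]) for s in services}


# Hypothetical ratings on an assumed 0-5 scale for a site offering two services.
ratings = {
    "content": {"accuracy": 4, "currency": 2, "authority": 5},
    "care": {"privacy": 5, "authority": 3},
}
scores = evaluate(["content", "care"], ratings)
print(scores)
```

Because each package is independent, a new service could be supported by adding one entry to `PACKAGES` without changing the packages that already exist.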
The following benefits can be obtained from the steps mentioned above:
If new services are created in light of technological advances, the evaluation criteria for these services will be specified separately; thus, existing services and their evaluation criteria remain unchanged.
If there is a primary agreed method for evaluating each service, it can be gradually evolved. As a result of this evolution, the evaluation of the websites providing these services will be more accurate.
During evaluation, with reliable guidelines available, the website type is identified, and based on its functionality and usability, the website is ranked and presented to the general public.
Health websites play a key role in providing health services and promoting social health. Therefore, the general public, including healthy people, patients, and particularly health care providers, are considered to be users of these websites. The remarkable usage of these websites and their role as diagnostic and therapeutic tools add value to their evaluation when determining their degree of validity. This study found that the evaluation criteria of these websites, which are in "Health Guidelines", are not adequate to evaluate these new technologies. In addition, because there are no comprehensive and standardized guidelines for evaluating health websites, no meaningful comparison of the evaluation results can be made.
Obviously, if there is a reliable and accurate guideline for evaluating health websites, standards can be defined for designing these websites, and the infrastructure for automatically evaluating them can be provided by implementing evaluation software as robots; then, the evaluation results of these robots would be standardized in each evaluation and quickly updated based on website changes. Ultimately, the website type is identified, and based on its functionality and usability, it is ranked and presented to the general public.
The authors agree on this final form of the manuscript and attest that all authors contributed to the final draft of the manuscript.
CONFLICTS OF INTEREST
The authors declare no conflicts of interest regarding the publication of this study.
No financial interests related to the material of this manuscript have been declared.