Catalonia's museums websites: Analysis and evaluation proposal

Ricard Monistrol, Cristòfol Rovira, Lluis Codina

  1. Introduction
  2. Research Methodology
    2.1. Data collection
    2.2. Analysis of Topics and Domain Type
  3. Quality Parameters.
    3.1. Accessibility
    3.2. Metadata tags results
    3.3. Positioning Results
    3.4. Website positioning results
    3.5. (X)HTML code quality results
  4. Conclusions
  5. Bibliography
  6. Annex: URLs of the websites analysed


1. Introduction

In 2001, Bellido (2001: 231-232) listed the advantages for museums of having a presence on the Internet: the ability to offer their information at any time and anywhere in the world; the capacity for a museum to update its own content without depending on graphic design companies (for brochures, posters, etc.); and the option of including multimedia resources (text, image and sound) that can be offered to users around the world.

By then, these advantages and resources were already highly valued and used by various museums, mainly in the United States. In 1996, the Metropolitan Museum of Art (New York) took advantage of its presence on the Internet (Kotler, 2001: 251) by creating a new membership category, friends of the museum, for only $50 a year, which offered exclusive resources such as online purchases, free software and virtual tours of the museum or its exhibitions.

Museums therefore appear on the Internet (Kotler, 2001: 252) because of one very positive consequence: an Internet user can visit several museums at one time and from one place, something that was not possible in the pre-Internet era.

However, the abundance of new websites dedicated to art and museums in recent years has highlighted the need for museums to review their websites as part of their publicity policy and as an element of prestige. Many aspects influence a site's visibility and web traffic, among them a well-chosen domain name, adequate accessibility and quality source code. These aspects can add to the museum's prestige and help the site fulfil its communication objectives by reaching most of its potential public.

This paper presents the results of a study of the websites of 68 Catalonian museums. To obtain this sample, we first selected 93 of the 154 museums registered with the Department of Culture of the Government of Catalonia.

To select these 93 museums we chose all those categorised as Science and Technology (Topic 1), Natural Sciences (Topic 2) or Art (Topic 3), and excluded any museum that could not be classified in one of these three categories (e.g. Numismatics, Biographies, Ethnology). Focusing on these three topics allowed us to include Catalonia's most important museums.

The reduction from 93 museums to the final 68 came from eliminating museums that did not have a website, museums whose websites were not operating for any reason (e.g. the domain no longer belonged to the museum at the time of analysis) and museums that appeared twice in the register. Finally, we had to rule out six museums because their pages were built in Flash or Java, which prevented the automatic analyser from completing the analysis. The following chart lists the 68 museums analysed:


Chart 1: List of museums analysed

This study has several objectives. First, we wanted to establish which type of domain each of the 68 museums used: (1) a website with its own domain, (2) a website without its own domain but with an acceptable amount of content, or (3) a website without its own domain and consisting of only one page. We considered this first analysis useful because the type of domain in a URL influences the website's visibility. It is also an indicator of the importance each museum places on its website within its publicity and advertising policy.

Secondly, we wanted to assess the quality of each museum's website against four key parameters: (1) accessibility, (2) metadata, (3) positioning or visibility, and (4) source code quality relative to the standards.

We selected these four parameters because they provide measures of a site's general quality and therefore of its potential to fulfil the following functions: contributing to the publicity policy and supporting the museum's communication, thereby increasing public awareness of the museum's resources and characteristics.


2. Research Methodology

2.1. Data collection

All of the research data was obtained from a series of analyses performed in the first week of March 2006. The data for the initial 93 museums was obtained manually and with an automatic analyser, DigiDocSpider (DDS), a crawler-like program created by the Library and Documentation Sciences Department at the Pompeu Fabra University and the DigiDoc Research group at the Applied Linguistics Institute. DDS analyses the pages hosted on the server associated with the website's homepage URL, extracting and analysing previously selected elements of the source code. It can also send the page's URL to online validation services (XHTML, accessibility, CSS) and then compile the results into its report. Overall, DDS automatically compiles more than 100 indicators grouped into four website parameters: accessibility, metadata, search engine positioning (causes and results) and HTML code quality.

Of the 93 original registers, 25 could not be analysed, for the reasons shown in Table 1.

Type of error                  Registers
No website                     13
Repeated register              1
Failed links                   3
Java or Flash applications     6
Blocked server                 2
Total: non-analysable          25
Total: analysed                68

Table 1: Type of non-analysable registers

Of the 25 museums that could not be analysed, the majority (13) simply did not have a website. Surprisingly, in 3 cases the "official" link given in the government's register was broken or the domain had changed owners. In 2 other cases the server blocked access to the analyser (DDS), possibly for security reasons, and in 6 sites the "extreme" use of Flash or Java blocked the analysis. In this last case, the obstacle that stops DDS also affects conventional search engines such as Google or Yahoo!, and therefore limits the visibility and accessibility of these 6 sites; for this project, however, we simply removed them from the list of sites to be analysed.

2.2. Analysis of Topics and Domain Type

Graph 1 shows the distribution of the 68 museums analysed by topic, following the categorisation of the Catalonian Government's museum directory:

T1: Science and Technology

T2: Natural Sciences

T3: Art


Graph 1: Classified by Topic

Science and Technology museums are the most common (35), followed by Art museums (26) and Natural Sciences museums (7).

Graph 2 shows the distribution of the websites by the following types of host:

D1: Website with own domain

D2: Website without own domain

D3: Website without own domain and only one page (that is, without additional content or other resources linked from this page).


Graph 2: Classified by type of Domain

The second graph shows that the largest group, 34 websites, have their own domain. A further 22 museums host their websites in a subdomain, but with an amount and variety of content comparable to that of websites with their own domain. Finally, and surprisingly, 12 museums presented their content on a single web page (in one case, in March 2006, a single page held all of the museum's web content).

Setting aside the museums with only one page, one question arises: to what extent does having one's own domain guarantee a better score in each of the four analysis parameters?

To test this hypothesis, we chose to analyse the museums' websites by type of domain rather than by topic.

To achieve this, we performed a comparative analysis of the three types of Domains (D1, D2 and D3) for each of the four aspects to be analysed: Accessibility, meta data tags, search engine positioning (causes and results) and (X)HTML code quality.

The basic unit of analysis is the homepage (D1), the sub-site's homepage (D2) or the single web page (D3). Each of these pages was analysed by DigiDocSpider (DDS) to obtain the data for all the analysis parameters for each museum.

The results are expressed as a ranking for easy understanding, with a score from 0 to 10 for each of the four parameters analysed.

To create this ranking, we assigned a percentage weight to each of the more than 100 indicators analysed by DDS. This gave a comparative evaluation of the museums by domain type (D1, D2 and D3) and by each of the characteristics analysed (accessibility, metadata, positioning and XHTML code).
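
As an illustration of how percentage weights can be turned into a score from 0 to 10, the following minimal sketch assumes that each indicator has already been normalised to a value between 0 (worst) and 1 (best). The indicator names, weights and values are hypothetical and are not those used by DDS.

```python
def weighted_score(indicators, weights):
    """Combine normalised indicators (0 = worst, 1 = best) into a 0-10 score.

    `weights` holds one percentage per indicator; they must add up to 100.
    """
    total_weight = sum(weights.values())
    if abs(total_weight - 100) > 1e-9:
        raise ValueError("weights must add up to 100 percent")
    weighted_sum = sum(indicators[name] * weights[name] for name in weights)
    return weighted_sum / 10  # rescale the 0-100 weighted sum to 0-10


# Hypothetical indicators and weights, for illustration only.
example_weights = {"doctype_present": 40, "quoted_attributes": 35, "no_broken_links": 25}
example_site = {"doctype_present": 1.0, "quoted_attributes": 0.5, "no_broken_links": 1.0}

print(weighted_score(example_site, example_weights))
```

Averaging such scores over the sites of each domain type would give one comparable figure per parameter for D1, D2 and D3.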


3. Quality Parameters

3.1. Accessibility

Accessibility should be a basic element of the websites of institutions such as museums, which offer their information and content (when available) to the general public. This public includes people with physical or sensory disabilities, and people limited by their hardware or software, for whom a poorly built website or sub-site can be impossible to use.

Various online tools automatically check aspects of the source code related to accessibility. The World Wide Web Consortium (W3C) assigns one of three priority levels to each of its accessibility checkpoints:

- Priority 1: minimum level of accessibility.

- Priority 2: intermediate level of accessibility.

- Priority 3: total accessibility

Table 2 and Graph 3 show the results from three online tools: Hera, TAW and WAVE.

Accessibility indicators: Errors (Hera 1), Errors (Hera 2), Errors (Hera 3), Errors (TAW 1), Errors (TAW 2), Errors (TAW 3), Errors (WAVE), Total errors

Table 2: Domain accessibility errors


Graph 3: Total Errors by Domain

At the first level of analysis, D1 and D2 sites have almost the same number of total errors, with D1 slightly lower. D3 sites, however, have about twice as many total errors as the other two types.

At a more detailed level, with Hera 1, the least demanding accessibility level, all domain types pass with fewer than 2 errors per page. This is not the case with the TAW 1 indicator, whose values shoot up to between 11 and 12 errors per page for D1 and D2 sites, and to almost 23 errors per page for D3 sites.

The differences between the tools are larger still for the Hera 2 and TAW 2 indicators: the TAW 2 value is about eight times the Hera 2 value for D1 (6.62 against 52.94) and about nine times for D2 (6.32 against 57.39), and the gap widens to almost nineteen times for D3 sites (5.92 against 110.97). We cannot explain why these differences occur, since both Hera and TAW claim to follow the W3C standards.
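
The size of the gap between the two tools can be checked with a short calculation. The sketch below assumes that, in each pair of figures quoted above, the lower value is the Hera 2 average and the higher one the TAW 2 average.

```python
# Average errors per page quoted in the text for the priority 2 indicators.
# Assumption: the lower figure of each pair is Hera 2, the higher is TAW 2.
hera2 = {"D1": 6.62, "D2": 6.32, "D3": 5.92}
taw2 = {"D1": 52.94, "D2": 57.39, "D3": 110.97}

# How many times larger the TAW 2 figure is than the Hera 2 figure.
ratios = {domain: taw2[domain] / hera2[domain] for domain in hera2}

for domain, ratio in ratios.items():
    print(f"{domain}: TAW 2 / Hera 2 = {ratio:.1f}")
```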

Tables 3, 4 and 5 rank the top ten museums in accessibility for each domain type.


Table 3: Classification of museums (D1) in accessibility


Table 4: Classification of museums (D2) in accessibility


Table 5: Classification of museums (D3) in accessibility

In the classification values shown in these tables, only the Museu Marítim de Barcelona (D1) and the Museu d'Història de Sabadell (D2) barely pass 5 points; the rest are below this value. The low marks are clearest in the D3 sites, where even the top-ranking site, the Museu de la Pell d'Igualada, barely passes 4 points.

3.2. Metadata tags results

Although metadata is not a priority factor for obtaining good positioning in Internet search engines, it offers the most exact way of identifying a resource's title, author, content, keywords and so on. Museums' websites should therefore be interested in distinguishing and presenting their content in a trustworthy fashion, which adds value to the information.

The increasing importance of metadata tags should also be taken into account for the near future. The next generation of web language and design, the semantic web, will rely on metadata tags to recognise content and communicate it to a search engine or a user.

We now check the use of metadata tags by each of the 3 domain types in the 68 selected museums (Table 6):

Looking at the proportion of websites with metadata tags (Meta percentage), around 80% of the sites in each of the three types (D1, D2 and D3) have them. The same cannot be said for Dublin Core labelling: only one website without its own domain, the Museu de Gavá (D2), has DC tags, while none of the rest do.

Metadata indicators: Meta percentage, Meta Dublin Core percentage, Meta average, Meta DC average, HTML author, DC creator, DC description, DC subject, DC title, HTML description, HTML http-equiv, HTML keywords, HTML robots, Tags RDF, Links RDF

Table 6: Metadata tag indicators by domain

How is the metadata spread across each web page? The average number of tags is above 2 per page in D1 (2.38) and D2 (2.68), but drops to 1.42 in D3. The tags most often used per page are, in this order: HTML http-equiv (D1, D2 and D3), HTML keywords (D1, D2 and D3), HTML description (D1, D2 and D3), HTML author (D1 and D2) and HTML robots (D1).

None of the sites has RDF tags or links to RDF resources.
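
A count of this kind can be sketched with Python's standard HTML parser. This is only an illustration of the idea, not the method used by DDS; the class name and the sample page are hypothetical, and Dublin Core tags are assumed to be those whose name starts with "DC.".

```python
from html.parser import HTMLParser


class MetaTagCounter(HTMLParser):
    """Collect the <meta> tags of one page, split into HTML and Dublin Core."""

    def __init__(self):
        super().__init__()
        self.html_meta = []    # e.g. "http-equiv", "keywords", "description"
        self.dublin_core = []  # e.g. "dc.title", "dc.creator"

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "http-equiv" in attributes:
            self.html_meta.append("http-equiv")
            return
        name = (attributes.get("name") or "").lower()
        if name.startswith("dc."):
            self.dublin_core.append(name)
        elif name:
            self.html_meta.append(name)


# Hypothetical page head, for illustration only.
SAMPLE_PAGE = """
<html><head>
<title>Museu d'exemple</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="keywords" content="museum, art">
<meta name="description" content="A hypothetical museum page">
<meta name="DC.title" content="Museu d'exemple">
</head><body></body></html>
"""

counter = MetaTagCounter()
counter.feed(SAMPLE_PAGE)
print("HTML meta tags:", counter.html_meta)
print("Dublin Core tags:", counter.dublin_core)
```

Dividing the number of pages with at least one tag by the number of pages analysed gives a percentage indicator, and the mean list length gives a per-page average.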

3.3. Positioning Results

A museum must try to reach a wide audience, and on the Internet this is the most important point. For example, if a potential visitor searches for museums of contemporary art in a search engine like Google or Yahoo!, the website should appear among the top positions in the results list.

This web positioning can be achieved "ethically" by following some of the recommendations of Lluís Codina and Mari-Carmen Marcos (Codina, Marcos 2005).

For example, one of the most important factors is the number of links the website (or sub-site or web page) receives from other websites. To quantify this, Google has created a ranking (0-10) called PageRank, which is calculated from the number of links a page receives and from the websites that provide them.

That is why appearing in large directories like Dmoz and Yahoo!, which have a high PageRank (PR), raises this index for the websites they include. Another way of increasing a website's PR is "inheriting" it from the institution that hosts the website (universities, research centres, etc.), since part of the host's link value is automatically transferred.
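
The idea that a page's rank depends both on how many links it receives and on which sites provide them can be illustrated with the textbook power-iteration form of PageRank. This is a simplified sketch, not Google's production algorithm, and it does not reproduce the 0-10 scale; the three-page link graph and its page names are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to; every linked
    page must also appear as a key. Returns scores that add up to 1.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # Each exit link passes on an equal share of the page's rank.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without exit links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical web: a directory and a town hall site both link to the museum.
toy_web = {
    "directory": ["museum", "town_hall"],
    "town_hall": ["museum"],
    "museum": ["directory"],
}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the museum page, which receives links from both other pages, ends up with a higher score than the town hall page, which receives only one.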

The following indicators measure positioning relative to directories and search engine indexes:

1. Dmoz domain: Percentage of domains included in the Dmoz directory.

2. Yahoo (di) domain: Percentage of domains included in the Yahoo directory.

3. Google domain: Average number of pages indexed in Google.

4. Yahoo (bu) domain: Average number of pages indexed in Yahoo's search engine.

5. Dmoz page: Percentage of analysed pages included in the Dmoz directory.

6. Yahoo (di) page: Percentage of analysed pages included in the Yahoo directory.

We also have indicators for the number of links a website receives from other websites (entry links) and the number of links it gives to others (exit links):

7. Entry links G: Per page average of the number of entry links according to Google.

8. Entry links Y: Per page average of the number of entry links according to Yahoo.

9. Entry links Y (ext): (Visibility) average number of external links to our page according to Yahoo.

10. Exit links (ext): Per page average of the number of exit links.

11. Exit links (int): Per page average of the number of internal Exit links.

17. Luminosity: Total exit links

We must also keep in mind that several source code tags play an important role in the indexing of a website or page: for example, the tags describing the content, the title, the alternative text (alt) of images and links, and the links' title:

12. Empty titles: Percentage of pages with empty title tags.

13. Alt image: Percentage of images with the alt parameter.

14. Alt links: Percentage of links with the alt parameter.

15. Link's title: Percentage of links with the title parameter.

Finally, we add one last indicator: frames, which make it difficult for search engines to index the content:

16. Frames: Per page average of the number of frames.
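
The on-page indicators in this list (empty titles, alt text on images, title on links, frames) can be sketched with Python's standard HTML parser. The class name, helper function and sample page are hypothetical, and the sketch is an illustration rather than the DDS implementation.

```python
from html.parser import HTMLParser


class OnPageIndicators(HTMLParser):
    """Count images, links, frames and title text on one page."""

    def __init__(self):
        super().__init__()
        self.images = 0
        self.images_with_alt = 0
        self.links = 0
        self.links_with_title = 0
        self.frames = 0
        self.title_text = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            self.images += 1
            if attributes.get("alt"):
                self.images_with_alt += 1
        elif tag == "a" and "href" in attributes:
            self.links += 1
            if attributes.get("title"):
                self.links_with_title += 1
        elif tag in ("frame", "iframe"):
            self.frames += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_text += data


def percentage(part, whole):
    """Percentage of `part` in `whole`, or 0.0 when there is nothing to count."""
    return 100.0 * part / whole if whole else 0.0


# Hypothetical page, for illustration only.
SAMPLE_PAGE = """
<html><head><title></title></head><body>
<img src="logo.gif" alt="Museum logo">
<img src="spacer.gif">
<a href="collection.html" title="The collection">Collection</a>
<a href="visit.html">Visit</a>
</body></html>
"""

page = OnPageIndicators()
page.feed(SAMPLE_PAGE)
print("empty title:", page.title_text.strip() == "")
print("images with alt (%):", percentage(page.images_with_alt, page.images))
print("links with title (%):", percentage(page.links_with_title, page.links))
print("frames:", page.frames)
```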

Having described all the indicators, we now check the results for the museums by domain type, together with the total score. To obtain the average scores per domain type we applied a weighted percentage to each of the 17 indicators (Table 7):

Positioning causes indicators: Dmoz domain, Yahoo (di) domain, Google domain, Yahoo (bu) domain, Dmoz page, Yahoo (bu) page, Entry Links G, Entry Links Y, Entry Links Y (ext), Exit Link (ext), Exit Link (Int), Empty Title, Image alt, Links alt, Links title, Total score

Table 7: Positioning causes indicators by domain

The data confirm that these museums' presence in the Dmoz and Yahoo directories is minimal: the value is close to 1% in Dmoz, and in Yahoo it reaches only 0.23% (D2 being the highest).

Another negative aspect is the almost complete lack of alt parameters in both images and links, the very element that improves a web page's accessibility. It is also worth criticising that approximately 99% of all museums lack a title parameter in their links, a parameter that lets search engines refine their results relative to the search terms. When present, both kinds of data contribute to better website positioning.

Turning to the other data, it is interesting that sites with their own domain (D1) have fewer pages indexed in Google and Yahoo than the domains hosting sub-sites (D2). For example, Google indexes approximately 2,500 pages on average for museums with their own domain (D1), but 132,000 for the domains of museums in sub-sites (D2). Even the domains hosting single web pages (D3) have more than 28,000 pages in Google.

On a positive note, almost 100% of the websites analysed have no empty title tags and almost no frames. This makes the search engines' work easier and allows for better positioning.

We should also highlight that the average number of entry links is much higher in D1 and D2 than in D3: Yahoo's data shows more than 500 for D1 and D2, but only 46 links for D3.

Tables 8, 9 and 10 show the top 10 museums in positioning causes for D1, D2 and D3.


Table 8: Classification of positioning causes D1


Table 9: Classification of positioning causes D2


Table 10: Classification of positioning causes D3

It is evident that the low use of some parameters notably decreases the websites' scores. Only D1 and D2 have museums that are above the average or approach it (5 of 10 in D1 and 2 of 10 in D2). An interesting fact: the Fundación Miró (D2) has a better score than the Fundación Dalí (D1).

3.4. Website positioning results

In the previous section on "causes" we looked at the key parameters that research considers relevant to positioning. Now we check their results. The indicators used are PageRank and each page's or site's position in Google's and Yahoo!'s search results when its own title tag is used as the search term (Table 11).

Positioning indicators: Page Rank, Position G, Position Y

Table 11: Positioning results by domain

No domain type reached the average, as was also the case in the previous section on "causes". Let us look at these results in detail with the tables of the top ten museums per domain type (Tables 12, 13 and 14).


Table 12: Positioning results D1


Table 13: Positioning results D2


Table 14: Positioning results D3

All of the top ten museums in each domain type comfortably passed the average. It is nevertheless interesting to compare these positioning results lists with the previous positioning causes lists to note the changes in ranking.

For example, among the museums with their own domains we find the main art museums in Catalonia. In D1 (positioning causes) the top 4 museums were:

  • 1. Fundación Dalí

  • 2. MACBA

  • 3. MNAC

  • 4. Museo Picasso.

In the positioning results the order changes: the only museum to keep its position is the Fundación Dalí; the MACBA drops to 5th place, the MNAC falls out of the top ten list all the way down to 24th, and the Picasso goes down to 7th.
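
The change in ranking between the two lists can be made explicit with a short calculation, using only the positions quoted above for the four D1 museums.

```python
# Position in the positioning causes list and in the positioning results
# list (D1), as quoted in the text.
causes_rank = {"Fundación Dalí": 1, "MACBA": 2, "MNAC": 3, "Museo Picasso": 4}
results_rank = {"Fundación Dalí": 1, "MACBA": 5, "MNAC": 24, "Museo Picasso": 7}

# Positive values mean the museum ranks lower in results than in causes.
shifts = {name: results_rank[name] - causes_rank[name] for name in causes_rank}

for name, shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{name}: {causes_rank[name]} -> {results_rank[name]} (drops {shift} places)")
```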

3.5. (X)HTML code quality results

HTML has been reformulated as XHTML by applying the rules of XML, which demand higher-quality code. If the code does not meet these rules, the browser may download the page and then show an error message instead of the content. Because XML is a meta-language, all of the information on the page is classified and linked hierarchically, and this classification allows the page to supply information about its content and topics.

We now present the source code quality analysis for each museum's website and pages. Quality is measured by elements such as the presence of a doctype declaration, attribute values enclosed in quotes, the absence of tags not recommended by the W3C, tags written in lowercase letters, the number of errors in the CSS style sheets (according to the W3C) and the presence of broken links.
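
Some of these checks can be approximated with a few regular expressions. The sketch below is a rough illustration under stated assumptions, not a substitute for the W3C validators or for DDS: the function name and sample page are hypothetical, and the list of non-recommended tags is a partial list of elements deprecated in HTML 4.01.

```python
import re

# Elements deprecated in HTML 4.01 (partial list, for illustration).
DEPRECATED_TAGS = {"applet", "basefont", "center", "dir", "font", "menu", "strike", "u"}

# Rough patterns: good enough for a sketch, not a full HTML tokenizer.
OPENING_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)([^<>]*)>")
ATTRIBUTE = re.compile(r"""[\w:-]+\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)""")


def code_quality(html):
    """Return a few source code quality indicators for one page."""
    tags = OPENING_TAG.findall(html)
    names = [name for name, _ in tags]
    values = [value for _, rest in tags for value in ATTRIBUTE.findall(rest)]
    lowercase = sum(name == name.lower() for name in names)
    return {
        "has_doctype": html.lstrip().lower().startswith("<!doctype"),
        "unquoted_attributes": sum(value[0] not in "\"'" for value in values),
        "deprecated_tags": sum(name.lower() in DEPRECATED_TAGS for name in names),
        "lowercase_tags_pct": 100.0 * lowercase / len(names) if names else 0.0,
    }


# Hypothetical page, for illustration only.
SAMPLE = (
    '<HTML><body bgcolor=white><font size="2">Hello</font>'
    '<a href="a.html?id=3">link</a></body></HTML>'
)
report = code_quality(SAMPLE)
print(report)
```

CSS errors and broken links are left out of the sketch because they require an external validator and network requests.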

Table 15 shows the results for these elements as well as the total score, which is calculated by giving a percentage weight to each element. Graph 4 shows the level of XHTML errors, which, as indicated earlier, could cause browsers to fail to display the page in the future.

(X)HTML quality indicators: No quotes, Non-recommended tags, (X)HTML errors, CSS errors, Lower case tags, Broken links, Total score

Table 15: (X)HTML code quality analysis


Graph 4: (X)HTML code error level by domain

The first interesting result is that although the D3 sites have the highest level of errors in the XHTML code (Graph 4, twice that of the others), they also have the highest total score (Table 15). Which indicator distinguishes them from the other domain types? It is the almost complete lack of errors in the CSS style sheets: 0.83 per page, compared with 4.97 (D1) and 9.86 (D2).

Among the other indicators, the lack of doctype declarations stands out. Only D2 barely reaches the 40% mark, while D1 is at 32% and D3 at just 8%. D3 pages have more than 76 tags not recommended by the W3C, but D1 with 15 and D2 with 21 are also badly placed, which makes things difficult for browsers and for search engines.

We have already noted the high level of (X)HTML errors across all domain types (excessive in D3); it is also unacceptable to see so many pages with attributes that are not within quotes (D1 and D3 approach 13, while D2 has almost 22).

On the other hand, none of the pages analysed has broken links, and the use of lowercase letters when writing tags averages close to 80%.


4. Conclusions

Type of domain. In answer to the initial hypothesis, there is no noticeable difference between a website with its own domain and one in a sub-site: the data from the top 10 museums in both D1 and D2 show similar quality levels in all parameters.

On the other hand, for D3 sites the situation is very different. These museums are hosted within a server (usually public institutions like the town hall or the city council), and have the worst scores in comparison to the D1 and D2 in all four aspects analysed: accessibility, metadata, positioning and (X)HTML code.

However, we must note the importance of having one's own domain for users who browse. For example, a museum whose name includes the name of its city would automatically share visibility with the city in a search engine query. This combination could bring both the city and the museum additional publicity and visits from cultural tourism.

We must also highlight how difficult many town hall home pages make it to find the direct link to the museum or museums in the city. One solution would be a direct and visible link to the museum's website, whether it has its own domain or sits under the town hall's.

Moving from D3 to D2 or D1 has other advantages as well: it offers the necessary tools to improve the site's source code, accessibility and positioning. The investment would be returned through a greater number of virtual visits and, as a consequence, more real tourism to the museum and the city.

Furthermore, a private domain allows the owner to use its own language and design as well as its own content on the website, and so to present itself in a way that better fits the museum's own image strategy. It also allows the site to be saved under "favourites" in the user's browser.

On the other hand, it is worth noting that currently many well-known museums have taken advantage of their own initials (Macba, MNAC, IVAM...) as their own domain, since the initials are easier to memorise and promote. As a consequence, their URL is more easily recognised than a longer address (which is the case for museums without their own domain).

Quality Parameters. The research results show that low levels of accessibility are widespread among the museums: only a few museums slightly passed the midpoint of 5 out of 10.

The same situation is seen in the metadata indicators. Although this is not an urgent matter, it is important within the growing framework of the semantic web, and none of the museums' websites reach acceptable levels.

When discussing the causes and results of popularity, we must separate the important museums (in economic terms, number of physical visits, publicity campaigns and government support) from the other museums. For example, the Fundación Dalí, the Macba and the Picasso obtained good results in both positioning causes and positioning results. Finally, the MNAC still has a lot to improve in terms of positioning.

Conclusions: We must highlight the importance of all the websites fulfilling at least the W3C's level A (level AA is recommended). We also consider it important to improve the positioning of all the museums, in both causes and results, since a greater number of virtual visits depends on positioning, and virtual visits are converted into real visits.


